Figure 1. Construction of a pC5 H6p WNV prM-M-E donor plasmid, pDS-2946-1-1







Figure 2. Sequence of C5 H6p WNV prM-M-E C5 in pDS-2946-1-1

=> C5R

1 TGAATGTTAA ATGTTATACT TTGGATGAAG CTATAAATAT GCATTGGAAA AATAATCCAT 

61 TTAAAGAAAG GATTCAAATA CTACAAAACC TAAGCGATAA TATGTTAACT AAGCTTATTC 

121 TTAACGACGC TTTAAATATA CACAAATAAA CATAATTTTT GTATAACCTA ACAAATAACT

181 AAAACATAAA AATAATAAAA GGAAATGTAA TATCGTAATT ATTTTACTCA GGAATGGGGT 

241 TAAATATTTA TATCACGTGT ATATCTATAC TGTTATCGTA TACTCTTTAC AATTACTATT 

301 ACGAATATGC AAGAGATAAT AAGATTACGT ATTTAAGAGA ATCTTGTCAT GATAATTGGG

361 TACGACATAG TGATAAATGC TATTTCGCAT CGTTACATAA AGTCAGTTGG AAAGATGGAT 

421 TTGACAGATG TAACTTAATA GGTGCAAAAA TGTTAAATAA CAGCATTCTA TCGGAAGATA 

481 GGATACCAGT TATATTATAC AAAAATCACT GGTTGGATAA AACAGATTCT GCAATATTCG 

541 TAAAAGATGA AGATTACTGC GAATTTGTAA ACTATGACAA TAAAAAGCCA TTTATCTCAA 

601 CGACATCGTG TAATTCTTCC ATGTTTTATG TATGTGTTTC AGATATTATG AGATTACTAT 

661 AAACTTTTTG TATACTTATA TTCCGTAAAC TATATTAATC ATGAAGAAAA TGAAAAAGTA 

721 TAGAAGCTGT TCACGAGCGG TTGTTGAAAA CAACAAAATT ATACATTCAA GATGGCTTAC 

781 ATATACGTCT GTGAGGCTAT CATGGATAAT GACAATGCAT CTCTAAATAG GTTTTTGGAC 

841 AATGGATTCG ACCCTAACAC GGAATATGGT ACTCTACAAT CTCCTCTTGA AATGGCTGTA 

901 ATGTTCAAGA ATACCGAGGC TATAAAAATC TTGATGAGGT ATGGAGCTAA ACCTGTAGTT 

961 ACTGAATGCA CAACTTCTTG TCTGCATGAT GCGGTGTTGA GAGACGACTA CAAAATAGTG 

1021 AAAGATCTGT TGAAGAATAA CTATGTAAAC AATGTTCTTT ACAGCGGAGG CTTTACTCCT 

1081 TTGTGTTTGG CAGCTTACCT TAACAAAGTT AATTTGGTTA AACTTCTATT GGCTCATTCG 

1141 GCGGATGTAG ATATTTCAAA CACGGATCGG TTAACTCCTC TACATATAGC CGTATCAAAT 

1201 AAAAATTTAA CAATGGTTAA ACTTCTATTG AACAAAGGTG CTGATACTGA CTTGCTGGAT 

1261 AACATGGGAC GTACTCCTTT AATGATCGCT GTACAATCTG GAAATATTGA AATATGTAGC 

1321 ACACTACTTA AAAAAAATAA AATGTCCAGA ACTGGGAAAA ATTGATCTTG CCAGCTGTAA 

1381 TTCATGGTAG AAAAGAAGTG CTCAGGCTAC TTTTCAACAA AGGAGCAGAT GTAAACTACA 

1441 TCTTTGAAAG AAATGGAAAA TCATATACTG TTTTGGAATT GATTAAAGAA AGTTACTCTG 

1501 AGACACAAAA GAGGTAGCTG AAGTGGTACT CTCAAAAAGG TACGTGACTA ATTAGCTATA 

1561 AAAAGGATCC GGGTTAATTA ATTAGTCATC AGGCAGGGCG AGAACGAGAC TATCTGCTCG 



=> H6p

1621 TTAATTAATT AGAGCTTCTT TATTCTATAC TTAAAAAGTG AAAATAAATA CAAAGGTTCT
1681 TGAGGGTTGT GTTAAATTGA AAGCGAGAAA TAATCATAAA TTATTTCATT ATCGCGATAT

=> WNV capsid leader
M T G I A V M I G L I A S V
1741 CCGTTAAGTT TGTATCGTAA TGACCGGAAT TGCAGTCATG ATTGGCCTGA TCGCCAGCGT

=> WNV prM

. G A V TLSN FQG KVM MTVN ATD 
1801 AGGAGCAGTT ACCCTCTCTA ACTTCCAAGG GAAGGTGATG ATGACGGTAA ATGCTACTGA

,VTD VITI PTA AGK NLCI VRA 
1861 CGTCACAGAT GTCATCACGA TTCCAACAGC TGCTGGAAAG AACCTATGCA TTGTCAGAGC 

.MDV GYMC DDT ITY ECPV LSA 
1921 AATGGATGTG GGATACATGT GCGATGATAC TATCACTTAT GAATGCCCAG TGCTGTCGGC 

.GND PEDI D.CW CTK SAVY VRY 
1981 TGGTAATGAT CCAGAAGACA TCGACTGTTG GTGCACAAAG TCAGCAGTCT ACGTCAGGTA 

=> WNV M 

.GRC TKTR HSR RSR RSLT V Q T 
2041 TGGAAGATGC ACCAAGACAC GCCACTCAAG ACGCAGTCGG AGGTCACTGA CAGTGCAGAC

. H G E S T L A N K K G A W M D S T K A T
2101 ACACGGAGAA AGCACTCTAG CGAACAAGAA GGGGGCTTGG ATGGACAGCA CCAAGGCCAC 

, R Y L V K T E S W I L R N P G Y A L V A 
2161 AAGGTATTTG GTAAAAACAG AATCATGGAT CTTGAGGAAC CCTGGATATG CCCTGGTGGC 

.AVI G W M L G S N T M Q R V V F V V L 
2221 AGCCGTCATT GGTTGGATGC TTGGGAGCAA CACCATGCAG AGAGTTGTGT TTGTCGTGCT 

=> WNV E

. L L L V A P A Y S F NCL GMSN RDF 
2281 ATTGCTTTTG GTGGCCCCAG CTTACAGCTT CAACTGCCTT GGAATGAGCA ACAGAGACTT 

.LEG VSGA TWV DLV LEGD SCV 
2341 CTTGGAAGGA GTGTCTGGAG CAACATGGGT GGATTTGGTT CTCGAAGGCG ACAGCTGCGT 

.TIM SKD K PTI DVK MMNM EAA 
2401 GACTATCATG TCTAAGGACA AGCCTACCAT CGATGTGAAG ATGATGAATA TGGAGGCGGC

.NLA EVRS YCY LAT VSDL STK 
2461 CAACCTGGCA GAGGTCCGCA GTTATTGCTA TTTGGCTACC GTCAGCGATC TCTCCACCAA 

.AAC PTMG EAH NDK RADP AFV 
2521 AGCTGCGTGC CCGACCATGG GAGAAGCTCA CAATGACAAA CGTGCTGACC CAGCTTTTGT 

.CRQ GVVD RGW GNG CGLF GKG 
2581 GTGCAGACAA GGAGTGGTGG ACAGGGGCTG GGGCAACGGC TGCGGACTAT TTGGCAAAGG 

.SID TCAK FAC STK AIGR TIL 
2641 AAGCATTGAC ACATGCGCCA AATTTGCCTG CTCTACCAAG GCAATAGGAA GAACCATCTT 



mutated T5NT 

.KEN IKYE VAI FVH GPTT VES 
2701 GAAAGAGAAT ATCAAGTACG AAGTGGCCAT CTTCGTGCAC GGACCAACTA CTGTGGAGTC

.HGN YSTQ VGA TQA GRFS I TP 
2761 GCACGGAAAC TACTCCACAC AGGTTGGAGC CACTCAGGCA GGGAGATTCA GCATCACTCC 

.AAP SYTL K LG EYG EVTV DCE 
2821 TGCGGCGCCT TCATACACAC TAAAGCTTGG AGAATATGGA GAGGTGACAG TGGACTGTGA

.PRS GIDT NAY YVM TVGT KTF
2881 ACCACGGTCA GGGATTGACA CCAATGCATA CTACGTGATG ACTGTTGGAA CAAAGACGTT

.LVH REWF MDL NLP WSSA GST
2941 CTTGGTCCAT CGTGAGTGGT TCATGGACCT CAACCTCCCT TGGAGCAGTG CTGGAAGTAC

.VWR NRET LME FEE PHAT KQS 
3001 TGTGTGGAGG AACAGAGAGA CGTTAATGGA GTTTGAGGAA CCACACGCCA CGAAGCAGTC 

.VIA LGSQ EGA LHQ ALAG AIP 

3061 TGTGATAGCA TTGGGCTCAC AAGAGGGAGC TCTGCATCAA GCTTTGGCTG GAGCCATTCC

.VEF SSNT VKL TSG HLKC RVK 
3121 TGTGGAATTT TCAAGCAACA CTGTCAAGTT GACGTCGGGT CATTTGAAGT GTAGAGTGAA 

.MEK LQLK GTT YGV CSKA FKF 
3181 GATGGAAAAA TTGCAGTTGA AGGGAACAAC CTATGGCGTC TGTTCAAAGG CTTTCAAGTT 

.LGT PADT GHG TVV LELQ YTG 
3241 TCTTGGGACT CCCGCAGACA CAGGTCACGG CACTGTGGTG TTGGAATTGC AGTACACTGG 

.TDG PCKV PIS SAA SLND LTP 
3301 CACGGATGGA CCTTGCAAAG TTCCTATCTC GTCAGCGGCT TCATTGAACG ACCTAACGCC

.VGR LVTV NPF VSV ATAN AKV
3361 AGTGGGCAGA TTGGTCACTG TCAACCCTTT TGTTTCAGTG GCCACGGCCA ACGCTAAGGT

.LIE LEPP FGD SYI VVGR GEQ 
3421 CCTGATTGAA TTGGAACCAC CCTTTGGAGA CTCATACATA GTGGTGGGCA GAGGAGAACA 

.QIN HHWH KSG SSI GKAF TTT 

3481 ACAGATCAAT CACCATTGGC ACAAGTCTGG AAGCAGCATT GGCAAAGCCT TTACAACCAC

.LKG AQRL AAL GDT AWDF GSV 
3541 CCTCAAAGGA GCGCAGAGAC TAGCCGCTCT AGGAGACACA GCTTGGGACT TTGGATCAGT 

.GGV FTSV GKA VHQ VFGG AFR 
3601 TGGAGGGGTG TTCACCTCAG TTGGGAAGGC TGTCCATCAA GTGTTCGGAG GAGCATTCCG

.SLF GGMS WIT QGL LGAL LLW 
3661 CTCACTGTTC GGAGGCATGT CCTGGATAAC GCAAGGATTG CTGGGGGCTC TCCTGTTGTG 

.MGI NARD RSI ALT FLAV GGV 
3721 GATGGGCATC AATGCTCGTG ATAGGTCCAT AGCTCTCACG TTTCTCGCAG TTGGAGGAGT 

.LLF LSVN VHA 
3781 TCTGCTCTTC CTCTCCGTGA ACGTGCACGC TTAATTTTTA TCTAGAATCG ATCCCGGGTT 



=> C5L

3841 TTTATGACTA GTTAATCACG GCCGCCTTAT AAAGATCTAA AATGCATAAT TTCTAAATAA 

3901 TGAAAAAAAG TACATCATGA GCAACGCGTT AGTATATTTT ACAATGGAGA TTAACGCTCT

3961 ATACCGTTCT ATGTTTATTG ATTCAGATGA TGTTTTAGAA AAGAAAGTTA TTGAATATGA

4021 AAACTTTAAT GAAGATGAAG ATGACGACGA TGATTATTGT TGTAAATCTG TTTTAGATGA
4081 AGAAGATGAC GCGCTAAAGT ATACTATGGT TACAAAGTAT AAGTCTATAC TACTAATGGC
4141 GACTTGTGCA AGAAGGTATA GTATAGTGAA AATGTTGTTA GATTATGATT ATGAAAAACC
4201 AAATAAATCA GATCCATATC TAAAGGTATC TCCTTTGCAC ATAATTTCAT CTATTCCTAG
4261 TTTAGAATAC
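
The listing above follows a fixed layout: a start position, then blocks of ten nucleotides, sixty per full line. The following is a minimal sketch, not part of the original figures, of how such lines could be checked for numbering slips of the kind that scanning introduces. The function names `parse_listing_line` and `check_listing` are hypothetical, and the two sample rows are the first two lines of Figure 2.

```python
import re

def parse_listing_line(line):
    """Split 'NNN BLOCK BLOCK ...' into (start_position, sequence)."""
    m = re.match(r"(\d+)\s+([ACGT ]+)$", line.strip())
    if not m:
        raise ValueError(f"not a clean listing line: {line!r}")
    return int(m.group(1)), m.group(2).replace(" ", "")

def check_listing(lines):
    """Return (position, message) for every line whose start position
    does not follow on from the previous line's length."""
    problems = []
    expected = None
    for line in lines:
        pos, seq = parse_listing_line(line)
        if expected is not None and pos != expected:
            problems.append((pos, f"expected start {expected}"))
        expected = pos + len(seq)
    return problems

rows = [
    "1 TGAATGTTAA ATGTTATACT TTGGATGAAG CTATAAATAT GCATTGGAAA AATAATCCAT",
    "61 TTAAAGAAAG GATTCAAATA CTACAAAACC TAAGCGATAA TATGTTAACT AAGCTTATTC",
]
print(check_listing(rows))
```

A line that still contains scanning debris (for example `^` or `.` inside a block) is rejected by `parse_listing_line`, which makes damaged rows easy to locate before any numbering check is run.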



Figure 3. Construction of a pF8 H6p WNV prM-M-E donor plasmid, pSL-5513-1-1-1.







Figure 4. Sequence of F8 H6p WNV prM-M-E F8 in pSL-5513-1-1-1.

=> F8R 

1 GACCCTTTAC AAGAATAAAA GAAGAAACAA CTGTGAAATA GTTTATAAAT GTAATTCGTA 

61 TGCAGAAAAC GATAATATAT TTTGGTATGA GAAATCTA.^ GGAGACATAG TTTGTATAGA 

121 CATGCGCTCT TCCGATGAGA TATTCGATGC TTTTCTAATG TATCATATAG CTACAAGATA 

181 TGCCTATCAT GATGATGATA TATATCTACA AATAGTGTTA TATTATTCTA ATAATCAAAA 

241 TGTTATATCT TATATTACGA AAAATAAATA CGTTAAGTAT ATAAGAAATA AAACTAGAGA

301 CGATATTCAT AAAGTAAAAA TATTAGCTCT AGAAGACTTT ACAACGGAAG AAATATATTG 

361 TTGGATTAGT AATATATAAC AGCGTAGCTG CACGGTTTTG ATCATTTTCC AACAATATAA 

421 ACCAATGAAG GAGGACGACT CATCAAACAT AAATAACATT CACGGAAAAT ATTCAGTATC 

481 AGATTTATCA CAAGATGATT ATGTTATTGA ATGTATAGAC GGATCTTTTG ATTCGATCAA 

541 GTATAGAGAT ATAAAGGTTA TAATAATGAA GAATAACGGT TACGTTAATT GTAGTAAATT 

601 ATGTAAAATG CGGAATAAAT ACTTTTCTAG ATGGTTGCGT CTTTCTACTT CTAAAGCATT 

661 ATTAGACATT TACAATAATA AGTCAGTAGA TAATGCTATT GTTAAAGTCT ATGGTAAAGG 

721 TAAGAAACTT ATTATAACAG GATTTTATCT CAAACAAAAT ATGATACGTT ATGTTATTGA 
781 GTGGATAGGG GATGATTTTA CAAACGATAT ATACAAAATG ATTAATTTCT ATAATGCGTT

841 ATTCGGTAAC GATGAATTAA AAATAGTATC CTGTGAAAAC ACTCTATGCC CGTTTATAGA 

901 ACTTGGTAGA TGCTATTATG GTAAAAAATG TAAGTATATA CACGGAGATC AATGTGATAT 

961 CTGTGGTCTA TATATACTAC ACCCTACCGA TATTAACCAA CGAGTTTCTC ACAAGAAAAC 

1021 TTGTTTAGTA GATAGAGATT CTTTGATTGT GTTTAAAAGA AGTACCAGTA AAAAGTGTGG 

1081 CATATGCATA GAAGAAATAA ACAAAAAACA TATTTCCGAA CAGTATTTTG GAATTCTCCC 

1141 AAGTTGTAAA CATATTTTTT GCCTATCATG TATAAGACGT TGGGCAGATA CTACCAGAAA 

1201 TACAGATACT GAAAATACGT GTCCTGAATG TAGAATAGTT TTTCCTTTCA TAATACCCAG 

1261 TAGGTATTGG ATAGATAATA AATATGATAA AAAAATATTA TATAATAGAT ATAAGAAAAT 

1321 GATTTTTACA AAAATACCTA TAAGAACAAT AAAAATATAA TTACATTTAC GGAAAATAGC 

1381 TGGTTTTAGT TTACCAACTT AGAGTAATTA TCATATTGAA TCTATATTGC TAATTAGCTA 

1441 ATAAAAACCC GGGTTAATTA ATTAGTCATC AGGCAGGGCG AGAACGAGAC TATCTGCTCG 

=> H6p 

1501 TTAATTAATT AGAGCTTCTT TATTCTATAC TTAAAAAGTG AAAATAAATA CAAAGGTTCT 



1561 TGAGGGTTGT GTTAAATTGA AAGCGAGAAA TAATCATAAA TTATTTCATT ATCGCGATAT 

=> WNV capsid leader
M T G I A V M I G L I A S V
1621 CCGTTAAGTT TGTATCGTAA TGACCGGAAT TGCAGTCATG ATTGGCCTGA TCGCCAGCGT

=> WNV prM

.GAV TLSN FQG KVM MTVN ATD
1681 AGGAGCGGTT ACCCTCTCTA ACTTCCAAGG GAAGGTGATG ATGACGGTAA ATGCTACTGA

.VTD VITI PTA AGK NLCI VRA
1741 CGTCACAGAT GTCATCACGA TTCCAACAGC TGCTGGAAAG AACCTATGCA TTGTCAGAGC

.MDV GYMC DDT ITY ECPV LSA
1801 AATGGATGTG GGATACATGT GCGATGATAC TATCACTTAT GAATGCCCAG TGCTGTCGGC

.GND PEDI DCW CTK SAVY VRY 
1861 TGGTAATGAT CCAGAAGACA TCGACTGTTG GTGCACAAAG TCAGCAGTCT ACGTCAGGTA 

=> WNV M 

.GRC TKTR HSR RSR RSLT V Q T
1921 TGGAAGATGC ACCAAGACAC GCCACTCAAG ACGCAGTCGG AGGTCACTGA CAGTGCAGAC

. H G E S T L A N K K G A W M D S T K A T
1981 ACACGGAGAA AGCACTCTAG CGAACAAGAA GGGGGCTTGG ATGGACAGCA CCAAGGCCAC 

. R Y L V K T E S W I L R N P G Y A L V A 
2041 AAGGTATTTG GTAAAAACAG AATCATGGAT CTTGAGGAAC CCTGGATATG CCCTGGTGGC 

, A V I G W M L G S N T M Q R V V F V V L 
2101 AGCCGTCATT GGTTGGATGC TTGGGAGCAA CACCATGCAG AGAGTTGTGT TTGTCGTGCT 

=> WNV E 

.LLL V A P A Y S F NCL GMSN RDF 
2161 ATTGCTTTTG GTGGCCCCAG CTTACAGCTT CAACTGCCTT GGAATGAGCA ACAGAGACTT 

.LEG VSGA TWV DLV LEGD SCV
2221 CTTGGAAGGA GTGTCTGGAG CAACATGGGT GGATTTGGTT CTCGAAGGCG ACAGCTGCGT 

.TIM SKDK PTI DVK MMNM EAA 
2281 GACTATCATG TCTAAGGACA AGCCTACCAT CGATGTGAAG ATGATGAATA TGGAGGCGGC

.NLA EVRS YCY LAT VSDL STK
2341 CAACCTGGCA GAGGTCCGCA GTTATTGCTA TTTGGCTACC GTCAGCGATC TCTCCACCAA

.AAC PTMG EAH NDK RADP AFV
2401 AGCTGCGTGC CCGACCATGG GAGAAGCTCA CAATGACAAA CGTGCTGACC CAGCTTTTGT

.CRQ GVVD RGW GNG CGLF GKG 
2461 GTGCAGACAA GGAGTGGTGG ACAGGGGCTG GGGCAACGGC TGCGGACTAT TTGGCAAAGG 

.SID TCAK FAC STK AIGR TIL 
2521 AAGCATTGAC ACATGCGCCA AATTTGCCTG CTCTACCAAG GCAATAGGAA GAACCATCTT

.KEN IKYE VAI FVH GPTT VES
2581 GAAAGAGAAT ATCAAGTACG AAGTGGCCAT CTTCGTGCAC GGACCAACTA CTGTGGAGTC 



.HGN YSTQ VGA TQA GRFS ITP
2641 GCACGGAAAC TACTCCACAC AGGTTGGAGC CACTCAGGCA GGGAGATTCA GCATCACTCC

.AAP SYTL KLG EYG EVTV DCE
2701 TGCGGCGCCT TCATACACAC TAAAGCTTGG AGAATATGGA GAGGTGACAG TGGACTGTGA

.PRS GIDT NAY YVM TVGT KTF
2761 ACCACGGTCA GGGATTGACA CCAATGCATA CTACGTGATG ACTGTTGGAA CAAAGACGTT

.LVH REWF MDL NLP WSSA GST
2821 CTTGGTCCAT CGTGAGTGGT TCATGGACCT CAACCTCCCT TGGAGCAGTG CTGGAAGTAC

.VWR NRET LME FEE PHAT KQS
2881 TGTGTGGAGG AACAGAGAGA CGTTAATGGA GTTTGAGGAA CCACACGCCA CGAAGCAGTC

.VIA LGSQ EGA LHQ ALAG AIP
2941 TGTGATAGCA TTGGGCTCAC AAGAGGGAGC TCTGCATCAA GCTTTGGCTG GAGCCATTCC

.VEF SSNT VKL TSG HLKC RVK
3001 TGTGGAATTT TCAAGCAACA CTGTCAAGTT GACGTCGGGT CATTTGAAGT GTAGAGTGAA

.MEK LQLK GTT YGV CSKA FKF
3061 GATGGAAAAA TTGCAGTTGA AGGGAACAAC CTATGGCGTC TGTTCAAAGG CTTTCAAGTT

.LGT PADT GHG TVV LELQ YTG
3121 TCTTGGGACT CCCGCAGACA CAGGTCACGG CACTGTGGTG TTGGAATTGC AGTACACTGG

.TDG PCKV PIS SAA SLND LTP
3181 CACGGATGGA CCTTGCAAAG TTCCTATCTC GTCAGCGGCT TCATTGAACG ACCTAACGCC

.VGR LVTV NPF VSV ATAN AKV
3241 AGTGGGCAGA TTGGTCACTG TCAACCCTTT TGTTTCAGTG GCCACGGCCA ACGCTAAGGT

.LIE LEPP FGD SYI VVGR GEQ
3301 CCTGATTGAA TTGGAACCAC CCTTTGGAGA CTCATACATA GTGGTGGGCA GAGGAGAACA

.QIN HHWH KSG SSI GKAF TTT
3361 ACAGATCAAT CACCATTGGC ACAAGTCTGG AAGCAGCATT GGCAAAGCCT TTACAACCAC

.LKG AQRL AAL GDT AWDF GSV
3421 CCTCAAAGGA GCGCAGAGAC TAGCCGCTCT AGGAGACACA GCTTGGGACT TTGGATCAGT

.GGV FTSV GKA VHQ VFGG AFR
3481 TGGAGGGGTG TTCACCTCAG TTGGGAAGGC TGTCCATCAA GTGTTCGGAG GAGCATTCCG

.SLF GGMS WIT QGL LGAL LLW
3541 CTCACTGTTC GGAGGCATGT CCTGGATAAC GCAAGGATTG CTGGGGGCTC TCCTGTTGTG

.MGI NARD RSI ALT FLAV GGV
3601 GATGGGCATC AATGCTCGTG ATAGGTCCAT AGCTCTCACG TTTCTCGCAG TTGGAGGAGT

.LLF LSVN VHA
3661 TCTGCTCTTC CTCTCCGTGA ACGTGCACGC TTAATTTTTA TCTAGAGTCG AGTTTTTATT



=> F8L

3721 GACTAGTTAA TCATAAGATA AATAATATAC AGCATTGTAA CCATCGTCAT CCGTTATACG 

3781 GGGAATAATA TTACCATACA GTATTATTAA ATTTTCTTAC GAAGAATATA GATCGGTATT 

3841 TATCGTTAGT TTATTTTACA TTTATTAATT AAACATGTCT ACTATTACCT GTTATGGAAA

3901 TGACAAATTT AGTTATATAA TTTATGATAA AATTAAGATA ATAATAATGA AATCAAATAA

3 961 TTATGTAAAT GCTACTAGAT TATGTGAATT ACGAGGAAGA AAGTTTACGA ACTGGAA.^AA 

4021 ATTAAGTGAA TCTAAAATAT TAGTCGATAA TGTAAAAAAA ATAAATGATA AAACTAACCA
4081 GTTAAAAACG GATATGATTA TATACGTTAA GGATATTGAT CATAAAGGAA GAGATACTTG
4141 CGGTTACTAT GTACACCAAG ATCTGGTATC TTCTATATCA AATTGGATAT CTCCGTTATT
4201 CGCCGTTAAG GTAAATAAAA TTATTAACTA TTATATATGT AATGAATATG ATATACGACT
4261 TAGCGAAATG GAATCTGATA TGACAGAAGT AATAGATGTA GTTGATAAAT TAGTAGGAGG
4321 ATACAATGAT GAAATAGCAG AAATAATATA TTTGTTTAAT AAATTTATAG AAAAATATAT
4381 TGCTAACATA TCGTTATCAA CTGAATTATC TAGTATATTA AATAATTTTA TAAATTTTAA
4441 TAAAAAATAC AATAACGACA TAAAAGATAT TAAATCTTTA ATTCTTGATC TGAAAAACAC
4501 ATCTATAAAA CTAGATAAAA AGTTATTCGA TAAAGATAAT AATGAATCGA ACGATGAAAA
4561 ATTGGAAACA GAAGTTGATA AGCTAATTTT TTTCATCTAA ATAGTATTAT TTTATTGAAG
4621 TACGAAGTTT TACGTTAGAT AAATAATAAA GGTCGATTTT TACTTTGTTA AATATCAAAT
4681 ATGTCATTAT CTGATAAAGA TACAAAAACA CACGGTGATT ATCAACCATC TAACGAACAG
4741 ATATTACAAA AAATACGTCG GACTATGGAA AACGAAGCTG ATAGCCTCAA TAGAAGAAGC
4801 ATTAAAGAAA TTGTTGTAGA TGTTATGAAG AATTGGGATC ATCCTCTCAA CGAAGAAATA
4861 GATAAAGTTC TAAACTGGAA AAATGATACA TTAAACGATT TAGATCATCT AAATACAGAT
4921 GATAATATTA AGGAAATCAT ACAATGTCTG ATTAGAGAAT TTGCGTTTAA AAAGATCAAT
4981 TCTATTATGT ATAGTTATGC TATGGTAAAA CTCAATTCAG ATAACGAAAC ATTGAAAGAT
5041 AAAATTAAGG ATTATTTTAT AGAAACTATT CTTAAAGACA AACGTGGTTA TAAACAAAAG 
5101 CCATTACCC 



Figure 5. Immunoblot analysis of the expression of WNV proteins from pox recombinants in
CEFs

Pellets               Supernatants

M 1 2 3 4 5 6 7 8     1 2 3 4 5 6 7 8 M

Lane 1: ALVAC
Lane 2: Fowl pox
Lane 3: vCP2017 24h harvest
Lane 4: vCP2017 48h harvest
Lane 5: vCP2018 24h harvest
Lane 6: vCP2018 48h harvest
Lane 7: vFP2000 24h harvest
Lane 8: vFP2000 48h harvest

vCP2017 = ALVAC WNV prM-M-E
vCP2018 = ALVAC-2 WNV prM-M-E
vFP2000 = Fowlpox WNV prM-M-E



Figure 6. Immunoblot analysis of the expression of WNV proteins from pox recombinants in 
BHK cells 



1 2 3 4 5 6 7 8 9 10 11 12 




Lane 1: vFP2000.3.2.1.2.1 pellet
Lane 2: vCP2017.3.3 pellet
Lane 3: vCP2018.6.3.3.2 pellet
Lane 4: mock infected BHK pellet
Lane 5: pTriEx-WNV transfection pellet
Lane 6: mock transfected BHK pellet
Lane 7: vFP2000.3.2.1.2.1 supt
Lane 8: vCP2017.3.3 supt
Lane 9: vCP2018.6.3.3.2 supt
Lane 10: mock infected BHK supt
Lane 11: pTriEx-WNV transfection supt
Lane 12: mock transfection supt

vCP2017 = ALVAC WNV
vCP2018 = ALVAC-2 WNV
vFP2000 = Fowlpox WNV



Figure 7. Construction of pVR1012 WNV prM-M-E, pSL-5448-1-1.




pVR1012 WNV prM-M-E (no Kozak)

Oligos 7743.SL/7744.SL
Pst I - EcoR V    Pst I - EcoR V Kozak

pVR1012 WNV prM-M-E



Figure 8. Nucleotide sequence and translation of the WNV prM-M-E region in pSL-5448-1-1,
pVR1012 WNV prM-M-E.



PstI Kozak => WNV capsid leader

M G S T G I A V M I G L I A S V
1 |CTGCAG|CC GC CACCATGG GA TCPlACCGGAA TTGCAGTCAT GATTGGCCTG ATCGCCAGCG 

^ WNV prM 

..GAV TLS NFQG KVM MTV NATD-
61 TAGGAGCAGT TACCCTCTCT AACTTCCAAG GGAAGGTGAT GATGACGGTA AATGCTACTG

,.VTD VIT IPTA AGK NLC IVRA- 
121 ACGTCACAGA TGTCATCACG ATTCCAACAG CTGCTGGAAA GAACCTATGC ATTGTCAGAG 

..MDV GYM CDDT ITY ECP VLSA- 
181 CAATGGATGT GGGATACATG TGCGATGATA CTATCACTTA TGAATGCCCA GTGCTGTCGG 

..GND PED ID CW CTK S AV YVRY- 
241 CTGGTAATGA TCCAGAAGAC ATCGACTGTT GGTGCACAAA GTCAGCAGTC TACGTCAGGT 

=> WNV M 

..GRC TKT RHSR RSR R S L T V Q T • 
301 ATGGAAGATG CACCAAGACA CGCCACTCAA GACGCAGTCG GAGGTCACTG ACAGTGCAGA 

, . H G E STL A N K K G A W M D S T K A T - 
361 CACACGGAGA AAGCACTCTA GCGAACAAGA AGGGGGCTTG GATGGACAGC ACCAAGGCCA 

. . R Y L V K T E S W I L R N P G Y A L V A - 
421 CAAGGTATTT GGTAAAAACA GAATCATGGA TCTTGAGGAA CCCTGGATAT GCCCTGGTGG 

..AVI G W M L G S N T M Q R V V F V V L - 
481 CAGCCGTCAT TGGTTGGATG CTTGGGAGCA ACACCATGCA GAGAGTTGTG TTTGTCGTGC

=> WNV E 

..LLL V A P A Y S F NCL GMS NRDF- 
541 TATTGCTTTT GGTGGCCCCA GCTTACAGCT TCAACTGCCT TGGAATGAGC AACAGAGACT 

,,LEG VSG ATWV DLV LEG DSCV- 
601 TCTTGGAAGG AGTGTCTGGA GCAACATGGG TGGATTTGGT TCTCGAAGGC GACAGCTGCG 

ClaI

..TIM SKD KPTI DVK MMN MEAA-
661 TGACTATCAT GTCTAAGGAC AAGCCTACCA TCGATGTGAA GATGATGAAT ATGGAGGCGG

..NLA EVR SYCY LAT VSD LSTK- 
721 CCAACCTGGC AGAGGTCCGC AGTTATTGCT ATTTGGCTAC CGTCAGCGAT CTCTCCACCA

..AAC PTM GEAH NDK RAD PAFV-
781 AAGCTGCGTG CCCGACCATG GGAGAAGCTC ACAATGACAA ACGTGCTGAC CCAGCTTTTG

..CRQ GVV DRGW GNG CGL FGKG- 
841 TGTGCAGACA AGGAGTGGTG GACAGGGGCT GGGGCAACGG CTGCGGACTA TTTGGCAAAG

..SID TCA KFAC STK AIG RTIL- 
901 GAAGCATTGA CACATGCGCC AAATTTGCCT GCTCTACCAA GGCAATAGGA AGAACCATCT 

..KEN IKY EVAI FVH GPT TVES- 
961 TGAAAGAGAA TATCAAGTAC GAAGTGGCCA TTTTTGTCCA TGGACCAACT ACTGTGGAGT 

..HGN YST QVGA TQA GRF S ITP- 
1021 CGCACGGAAA CTACTCCACA CAGGTTGGAG CCACTCAGGC AGGGAGATTC AGCATCACTC 



..AAP SYT LKLG EYG EVT VDCE
1081 CTGCGGCGCC TTCATACACA CTAAAGCTTG GAGAATATGG AGAGGTGACA GTGGACTGTG

..PRS GID TNAY YVM TVG TKTF
1141 AACCACGGTC AGGGATTGAC ACCAATGCAT ACTACGTGAT GACTGTTGGA ACAAAGACGT

..LVH REW FMDL NLP WSS AGST
1201 TCTTGGTCCA TCGTGAGTGG TTCATGGACC TCAACCTCCC TTGGAGCAGT GCTGGAAGTA

..VWR NRE TLME FEE PHA TKQS
1261 CTGTGTGGAG GAACAGAGAG ACGTTAATGG AGTTTGAGGA ACCACACGCC ACGAAGCAGT

..VIA LGS QEGA LHQ ALA GAIP
1321 CTGTGATAGC ATTGGGCTCA CAAGAGGGAG CTCTGCATCA AGCTTTGGCT GGAGCCATTC 

..VEF SSN TVKL TSG HLK CRVK
1381 CTGTGGAATT TTCAAGCAAC ACTGTCAAGT TGACGTCGGG TCATTTGAAG TGTAGAGTGA

..MEK LQL KGTT YGV CSK AFKF
1441 AGATGGAAAA ATTGCAGTTG AAGGGAACAA CCTATGGCGT CTGTTCAAAG GCTTTCAAGT

..LGT PAD TGHG TVV LEL QYTG

1501 TTCTTGGGAC TCCCGCAGAC ACAGGTCACG GCACTGTGGT GTTGGAATTG CAGTACACTG 

..TDG PCK VPIS SAA SLN DLTP
1561 GCACGGATGG ACCTTGCAAA GTTCCTATCT CGTCAGCGGC TTCATTGAAC GACCTAACGC

..VGR LVT VNPF VSV ATA NAKV 
1621 CAGTGGGCAG ATTGGTCACT GTCAACCCTT TTGTTTCAGT GGCCACGGCC AACGCTAAGG 

..LIE LEP PFGD SYI VVG RGEQ 
1681 TCCTGATTGA ATTGGAACCA CCCTTTGGAG ACTCATACAT AGTGGTGGGC AGAGGAGAAC 

..QIN HHW HKSG SSI GKA FTTT 
1741 AACAGATCAA TCACCATTGG CACAAGTCTG GAAGCAGCAT TGGCAAAGCC TTTACAACCA 

..LKG AQR LAAL GDT AWD FGSV 
1801 CCCTCAAAGG AGCGCAGAGA CTAGCCGCTC TAGGAGACAC AGCTTGGGAC TTTGGATCAG 

..GGV FTS VGKA VHQ VFG GAFR 
1861 TTGGAGGGGT GTTCACCTCA GTTGGGAAGG CTGTCCATCA AGTGTTCGGA GGAGCATTCC 

..SLF GGM SWIT QGL LGA LLLW
1921 GCTCACTGTT CGGAGGCATG TCCTGGATAA CGCAAGGATT GCTGGGGGCT CTCCTGTTGT 

..MGI NAR DRSI ALT FLA VGGV 
1981 GGATGGGCAT CAATGCTCGT GATAGGTCCA TAGCTCTCAC GTTTCTCGCA GTTGGAGGAG 

XbaI

..LLF LSV NVHA
2041 TTCTGCTCTT CCTCTCCGTG AACGTGCACG CTTAATTTTT ATCTAGA
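
The start of the Figure 8 listing is labelled with a PstI site and a Kozak sequence ahead of the initiating ATG. As a rough illustration only, the sketch below checks the two positions usually considered most important in the Kozak consensus (a purine at -3 and a G at +4) against the first 20 nucleotides as printed in the figure. The helper name `kozak_context` is hypothetical and is not taken from the source.

```python
def kozak_context(seq, atg_index):
    """Return the bases at -3 and +4 relative to the A of the ATG
    located at atg_index (0-based)."""
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    return seq[atg_index - 3], seq[atg_index + 3]

# First 20 nt of the Figure 8 listing: PstI site, spacer, Kozak, ATG
start = "CTGCAGCCGCCACCATGGGA"

has_psti = start.startswith("CTGCAG")   # PstI recognition site
atg = start.find("ATG")                 # first ATG in the fragment
minus3, plus4 = kozak_context(start, atg)

print(has_psti, atg, minus3, plus4)
```

Here `minus3` is expected to be a purine (A or G) and `plus4` a G if the printed context matches the consensus.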



FIGURE 9 

Sequence of pDS-2946-1-1. pC5 H6p WNV prM-M-E.

1 GCGCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCA 

61 CGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCT 

121 CACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAAT

C5R 

181 TGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGAATTGCGGCC 

241 GCAATTCTGAATGTTAAATGTTATACTTTGGATGAAGCTATAAATATGCATTGGAAAAAT 

301 AATCCATTTAAAGAAAGGATTCAAATACTACAAAACCTAAGCGATAATATGTTAACTAAG

361 CTTATTCTTAACGACGCTTTAAATATACACAAATAAACATAATTTTTGTATAACCTAACA

421 AATAACTAAAACATAAAAATAATAAAAGGAAATGTAATATCGTAATTATTTTACTCAGGA

481 ATGGGGTTAAATATTTATATCACGTGTATATCTATACTGTTATCGTATACTCTTTACAAT

541 TACTATTACGAATATGCAAGAGATAATAAGATTACGTATTTAAGAGAATCTTGTCATGAT

601 AATTGGGTACGACATAGTGATAAATGCTATTTCGCATCGTTACATAAAGTCAGTTGGAAA

661 GATGGATTTGACAGATGTAACTTAATAGGTGCAAAAATGTTAAATAACAGCATTCTATCG

721 GAAGATAGGATACCAGTTATATTATACAAAAATCACTGGTTGGATAAAACAGATTCTGCA 

781 ATATTCGTAAAAGATGAAGATTACTGCGAATTTGTAAACTATGACAATAAAAAGCCATTT 

841 ATCTCAACGACATCGTGTAATTCTTCCATGTTTTATGTATGTGTTTCAGATATTATGAGA 

901 TTACTATAAACTTTTTGTATACTTATATTCCGTAAACTATATTAATCATGAAGAAAATGA 

961 AAAAGTATAGAAGCTGTTCACGAGCGGTTGTTGAAAACAACAAAATTATACATTCAAGAT 

1021 GGCTTACATATACGTCTGTGAGGCTATCATGGATAATGACAATGCATCTCTAAATAGGTT 

1081 TTTGGACAATGGATTCGACCCTAACACGGAATATGGTACTCTACAATCTCCTCTTGAAAT 

1141 GGCTGTAATGTTCAAGAATACCGAGGCTATAAAAATCTTGATGAGGTATGGAGCTAAACC 

1201 TGTAGTTACTGAATGCACAACTTCTTGTCTGCATGATGCGGTGTTGAGAGACGACTACAA 

1261 AATAGTGAAAGATCTGTTGAAGAATAACTATGTAAACAATGTTCTTTACAGCGGAGGCTT 

1321 TACTCCTTTGTGTTTGGCAGCTTACCTTAACAAAGTTAATTTGGTTAAACTTCTATTGGC

1381 TCATTCGGCGGATGTAGATATTTCAAACACGGATCGGTTAACTCCTCTACATATAGCCGT 

1441 ATCAAATAAAAATTTAACAATGCTTAAACTTCTATTGAACAAAGGTGCTGATACTGACTT

1501 GCTGGATAACATGGGATGTACTCCTTTAATGATCGCTGTACAATCTGGAAATATTGAAAT

1561 ATGTAGCACACTACTTAAAAAAAATAAAATGTCCAGAACTGGGAAAAATTGATCTTGCCA

1621 GCTGTAATTCATGGTAGAAAAGAAGTGCTCAGGCTACTTTTCAACAAAGGAGCAGATGTA

1681 AACTACATCTTTGAAAGAAATGGAAAATCATATACTGTTTTGGAATTGATTAAAGAAAGT

1741 TACTCTGAGACACAAAAGAGGTAGCTGAAGTGGTACTCTCA.^^GGTACGTGACTAATTAG 

1801 Cr^T.AAA.4;4GGATCCGGGTTAATTAATTAGTCATCAGGCAGGGCGAGAACGAGACTATCT 

=> H6p

1861 GCTCGTTAATTAATTAGAGCTTCTTTATTCTATACTTAAAAAGTGAAAATAAATACAAAG

1921 GTTCTTGAGGGTTGTGTTAAATTGAAAGCGAGAAATAATCATAAATTATTTCATTATCGC

=> WNV capsid leader
MTGIAVMIGLIA
1981 GATATCCGTTAAGTTTGTATCGTAATGACCGGAATTGCAGTCATGATTGGCCTGATCGCC

=> WNV prM Start
SVGAVTLSNFQGKVMMTVNA
2041 AGCGTAGGAGCGGTTACCCTCTCTAACTTCCAAGGGAAGGTGATGATGACGGTAAATGCT

TDVTDVITIPTAAGKNLCIV 
2101 ACTGACGTCACAGATGTCATCACGATTCCAACAGCTGCTGGAAAGAACCTATGCATTGTC

RAMDVGYMCDDTITYECPVL 
2161 AGAGCAATGGATGTGGGATACATGTGCGATGATACTATCACTTATGAATGCCCAGTGCTG 

SAGNDPEDIDCWCTK SAVYV 
2221 TCGGCTGGTAATGATCCAGAAGACATCGACTGTTGGTGCACAAAGTCAGCAGTCTACGTC 

=> WNV M Start

RYGRCTKTRHSRRSRRSLTV
2281 AGGTATGGAAGATGCACCAAGACACGCCACTCAAGACGCAGTCGGAGGTCACTGACAGTG

QTHGESTLANKKGAWMDSTK

2341 CAGACACACGGAGAAAGCACTCTAGCGAACAAGAAGGGGGCTTGGATGGACAGCACCAAG 

ATRYLVKTESWILRNPGYAL
2401 GCCACAAGGTATTTGGTAAAAACAGAATCATGGATCTTGAGGAACCCTGGATATGCCCTG 

VAAVIGWMLGSNTMQRVVFV 
2461 GTGGCAGCCGTCATTGGTTGGATGCTTGGGAGCAACACCATGCAGAGAGTTGTGTTTGTC 

=> WNV E start 

VLLLLVAPAYSFNCLGMSNR 
2521 GTGCTATTGCTTTTGGTGGCCCCAGCTTACAGCTTCAACTGCCTTGGAATGAGCAACAGA

DFLEGVSGATWVDLVLEGDS 
2581 GACTTCTTGGAAGGAGTGTCTGGAGCAACATGGGTGGATTTGGTTCTCGAAGGCGACAGC 

CVTIMSKDKPTIDVKMMNME
2641 TGCGTGACTATCATGTCTAAGGACAAGCCTACCATCGATGTGAAGATGATGAATATGGAG



AANLAEVRSYCYLATVSDLS 
2701 GCGGCCAACCTGGCAGAGGTCCGCAGTTATTGCTATTTGGCTACCGTCAGCGATCTCTCC 

TKAACPTMGEAHNDKRADPA 
2761 ACCAAAGCTGCGTGCCCGACCATGGGAGAAGCTCACAATGACAAACGTGCTGACCCAGCT

FVCRQGVVDRGWGNGCGLFG
2821 TTTGTGTGCAGACAAGGAGTGGTGGACAGGGGCTGGGGCAACGGCTGCGGACTATTTGGC

KGSIDTCAKFACSTKAIGRT
2881 AAAGGAAGCATTGACACATGCGCCAAATTTGCCTGCTCTACCAAGGCAATAGGAAGAACC

mutated T5NT

ILKENIKYEVAIFVHGPTTV
2941 ATCTTGAAAGAGAATATCAAGTACGAAGTGGCCATCTTCGTGCACGGACCAACTACTGTG



ESHGNYSTQVGATQAGRFSI
3001 GAGTCGCACGGAAACTACTCCACACAGGTTGGAGCCACTCAGGCAGGGAGATTCAGCATC 

TPAAPSYTLKLGEYGEVTVD 
3061 ACTCCTGCGGCGCCTTCATACACACTAAAGCTTGGAGAATATGGAGAGGTGACAGTGGAC

CEPRSGIDTNAYYVMTVGTK 
3121 TGTGAACCACGGTCAGGGATTGACACCAATGCATACTACGTGATGACTGTTGGAACAAAG 

TFLVHREWFMDLNLPWSSAG
3181 ACGTTCTTGGTCCATCGTGAGTGGTTCATGGACCTCAACCTCCCTTGGAGCAGTGCTGGA 

STVWRNRETLMEFEEPHATK 
3241 AGTACTGTGTGGAGGAACAGAGAGACGTTAATGGAGTTTGAGGAACCACACGCCACGAAG 

QSVIALGSQEGALHQALAGA
3301 CAGTCTGTGATAGCATTGGGCTCACAAGAGGGAGCTCTGCATCAAGCTTTGGCTGGAGCC

IPVEFSSNTVKLTSGHLKCR
3361 ATTCCTGTGGAATTTTCAAGCAACACTGTCAAGTTGACGTCGGGTCATTTGAAGTGTAGA

VKMEKLQLKGTTYGVCSKAF
3421 GTGAAGATGGAAAAATTGCAGTTGAAGGGAACAACCTATGGCGTCTGTTCAAAGGCTTTC

KFLGTPADTGHGTVVLELQY
3481 AAGTTTCTTGGGACTCCCGCAGACACAGGTCACGGCACTGTGGTGTTGGAATTGCAGTAC

TGTDGPCKVPISSAASLNDL
3541 ACTGGCACGGATGGACCTTGCAAAGTTCCTATCTCGTCAGCGGCTTCATTGAACGACCTA

TPVGRLVTVNPFVSVATANA 
3601 ACGCCAGTGGGCAGATTGGTCACTGTCAACCCTTTTGTTTCAGTGGCCACGGCCAACGCT 

KVLIELEPPFGDSYIVVGRG 
3661 AAGGTCCTGATTGAATTGGAACCACCCTTTGGAGACTCATACATAGTGGTGGGCAGAGGA 

EQQINHHWHKSGSSIGKAFT 
3721 GAACAACAGATCAATCACCATTGGCACAAGTCTGGAAGCAGCATTGGCAAAGCCTTTACA 



TTLKGAQRLAALGDTAWDFG
3781 ACCACCCTCAAAGGAGCGCAGAGACTAGCCGCTCTAGGAGACACAGCTTGGGACTTTGGA

SVGGVFTSVGKAVHQVFGGA
3841 TCAGTTGGAGGGGTGTTCACCTCAGTTGGGAAGGCTGTCCATCAAGTGTTCGGAGGAGCA

FRSLFGGMSWITQGLLGALL
3901 TTCCGCTCACTGTTCGGAGGCATGTCCTGGATAACGCAAGGATTGCTGGGGGCTCTCCTG

LWMGINARDRSIALTFLAVG
3961 TTGTGGATGGGCATCAATGCTCGTGATAGGTCCATAGCTCTCACGTTTCTCGCAGTTGGA

=> C5L

GVLLFLSVNVHA*
4021 GGAGTTCTGCTCTTCCTCTCCGTGAACGTGCACGCTTAATTTTTATCTAGAATCGATCCC

4 081 GGGTTTTTATGACTAGTTAATCACGGCCGCTTATAAAGATCTAAAATGCATAATTTCTAA 

4141 ATAATGAAAAAAAGTACATCATGAGCAACGCGTTAGTATATTTTACAATGGAGATTAACG 

4201 CTCTATACCGTTCTATGTTTATTGATTCAGATGATGTTTTAGAAAAGAAAGTTATTGAAT

4261 ATGAAAACTTTAATGAAGATGAAGATGACGACGATGATTATTGTTGTAAATCTGTTTTAG

4321 ATGAAGAAGATGACGCGCTAAAGTATACTATGGTTACAAAGTATAAGTCTATACTACTAA

4381 TGGCGACTTGTGCAAGAAGGTATAGTATAGTGAAAATGTTGTTAGATTATGATTATGAAA

4441 AACCAAATAAATCAGATCCATATCTAAAGGTATCTCCTTTGCACATAATTTCATCTATTC

4501 CTAGTTTAGAATACCTGCAGCCAAGCTTGGCACTGGCCGTCGTTTTACAACGTCGTGACT

4561 GGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCT

4621 GGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATG

4681 GCGAATGGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCA

4741 TATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGCCCCGACACC

4801 CGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGAC

4861 AAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAAC

4921 GCGCGAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTAATGTCATGATAATAA

4981 TGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTT 

5041 TATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGC 

5101 TTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTC 

5161 CCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAA 

5221 AAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCG 



5281 GTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAG



5341 TTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCC 

5401 GCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTA

5461 CGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTG

5521 CGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACA

5581 ACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATAC

5641 CAAAGGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTAT 

5701 TAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGG 

5761 ATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATA

5821 AATCTGGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTA 

5881 AGCCCTCCCGTATGGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAA 

5941 ATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAG

6001 TTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGG 

6061 TGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACT 

6121 GAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCG 
6181 TAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATC

6241 AAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATA 

6301 CTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTA 

6361 CATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTC 

6421 TTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGG 

6481 GGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTAC

6541 AGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGG 

6601 TAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGT 

6661 ATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCT 

6721 CGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGG 

6781 CCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATA 

6841 ACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCAGCCGAACGACCGAGCGCA 

6901 GCGAGTCAGTGAGCGAGGAAGCGGAAGA 



FIGURE 10 

Construction of pC5 H6p WNV prM-M-E donor plasmids with a truncated H6p +/- truncated 
WNV capsid leader sequence. 
FIGURE 11

ALVAC WNV constructs with truncated H6p +/- truncated leader sequence
Explanation of terms: 

H6p (t) is the truncated H6p promoter, deleted between the Nru I site and the 3'-end of the promoter

WNV-L (t) refers to the truncated WNV capsid leader, which lacks the initiating Met and
therefore gives a shorter leader sequence
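
To make the WNV-L (t) definition concrete, the sketch below translates the capsid leader codons printed in this figure with and without the initiating ATG. It is an illustration under stated assumptions, not the authors' procedure: the codon table is the standard genetic code built in the usual TCAG order, and the names `translate`, `full_leader` and `truncated_leader` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a DNA string codon by codon, ignoring any trailing
    partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Capsid leader codons as printed at positions 2004-2039 of the listing
leader = "ATGACCGGAATTGCAGTCATGATTGGCCTGATCGCC"

full_leader = translate(leader)           # WNV-L
truncated_leader = translate(leader[3:])  # WNV-L (t): no initiating ATG

print(full_leader)
print(truncated_leader)
```

The truncated form is the same peptide minus its first residue, which is what "missing the initiating Met" describes at the sequence level.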

H6p 5'-WNV sequence in actual vCP2017: 

H6p 

1861 GCTCGTTAATTAATTAGAGCTTCTTTATTCTATACTTAAAAAGTGAAAATAAATACAAAG

Nru I

1921 GTTCTTGAGGGTTGTGTTAAATTGAAAGCGAGAAATAATCATAAATTATTTCATTATCGC

=> WNV capsid leader
MTGIAVMIGLIA
1981 GATATCCGTTAAGTTTGTATCGTAATGACCGGAATTGCAGTCATGATTGGCCTGATCGCC

=> WNV prM Start
SVGAVTLSNFQGKVMMTVNA
2041 AGCGTAGGAGCGGTTACCCTCTCTAACTTCCAAGGGAAGGTGATGATGACGGTAAATGCT

TDVTDVITIPTAAGKNLCIV
2101 ACTGACGTCACAGATGTCATCACGATTCCAACAGCTGCTGGAAAGAACCTATGCATTGTC



PCR primers for H6p (t) WNV prM-M-E WNV-L (t):

=> WNV capsid
Nru I TGIAVMIGL
11344.SL 5' ATTATCGCGAACCGGAATTGCAGTCATGATTGGCCTG



KPTIDVKM
AAGCCTACCATCGATGTGAAGATG
7616.SL 3' TTCGGATGGTAGCTACACTTCTAC

Cla I
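
The 3' primer above is printed 3'-to-5' directly under the sense strand, so each base should be the Watson-Crick complement of the base above it. The short sketch below checks that pairing and derives the conventional 5'-to-3' spelling of the primer; it also confirms that the sense strand contains the Cla I recognition site ATCGAT. Variable names are hypothetical and the check is illustrative only.

```python
# Complement table for unambiguous DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")

sense = "AAGCCTACCATCGATGTGAAGATG"           # sense strand, 5'->3'
printed_3_to_5 = "TTCGGATGGTAGCTACACTTCTAC"  # primer as printed, 3'->5'

# Each printed primer base should pair with the sense base above it
pairs_correctly = sense.translate(COMPLEMENT) == printed_3_to_5

# Reversing the printed string gives the primer in 5'->3' orientation
primer_5_to_3 = printed_3_to_5[::-1]

# Cla I recognises ATCGAT
cla_site_index = sense.find("ATCGAT")

print(pairs_correctly, primer_5_to_3, cla_site_index)
```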



PCR primers for H6p (t) WNV prM-M-E:

=> WNV capsid
Nru I MTGIAVMIGL
11345.SL 5' ATTATCGCGAATGACCGGAATTGCAGTCATGATTGGCCTG
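
The two 5' primers can be read as the junctions that result from deleting the H6p sequence between the Nru I site and the leader. The sketch below rebuilds both junctions from the full-length sequence shown earlier in this figure and compares them with the printed primers. It assumes the deletion starts immediately after the Nru I recognition site (TCGCGA), which is how the primers read; the variable names are hypothetical.

```python
NRUI = "TCGCGA"  # Nru I recognition site

# 3' end of H6p followed by the start of the WNV capsid leader,
# as printed at positions 1966-2034 of the vCP2017 listing
full = ("TTATTTCATTATCGCGATATCCGTTAAGTTTGTATCGTA"
        "ATGACCGGAATTGCAGTCATGATTGGCCTG")

primer_11345 = "ATTATCGCGAATGACCGGAATTGCAGTCATGATTGGCCTG"  # H6p (t), full leader
primer_11344 = "ATTATCGCGAACCGGAATTGCAGTCATGATTGGCCTG"     # H6p (t), WNV-L (t)

site_end = full.find(NRUI) + len(NRUI)  # first base after the Nru I site
atg = full.find("ATG", site_end)        # initiating ATG of the leader

# Truncated promoter joined directly to the ATG
h6t_junction = full[:site_end] + full[atg:]
# Truncated promoter joined to the leader without its ATG
h6t_lt_junction = full[:site_end] + full[atg + 3:]

print(h6t_junction.endswith(primer_11345))
print(h6t_lt_junction.endswith(primer_11344))
```

Both comparisons succeeding indicates that the printed primers are consistent with the "Explanation of terms" given at the top of the figure.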



FIGURE 12 



agtagttcgc ctgtgtgagc tgacaaactt agtagtgttt gtgaggatta acaacaatta 
60 

acacagtgcg agctgtttct tagcacgaag atctcg atg tct aag aaa cca gga 
114 

Met Ser Lys Lys Pro Gly 
1 5 

ggg ccc ggc aag agc cgg gct gtc aat atg cta aaa cgc gga atg ccc
162

Gly Pro Gly Lys Ser Arg Ala Val Asn Met Leu Lys Arg Gly Met Pro
10 15 20

cgc gtg ttg tcc ttg att gga ctg aag agg gct atg ttg agc ctg atc
210

Arg Val Leu Ser Leu Ile Gly Leu Lys Arg Ala Met Leu Ser Leu Ile
25 30 35

gac ggc aag ggg cca ata cga ttt gtg ttg gct ctc ttg gcg ttc ttc
258

Asp Gly Lys Gly Pro Ile Arg Phe Val Leu Ala Leu Leu Ala Phe Phe
40 45 50

agg ttc aca gca att gct ccg acc cga gca gtg ctg gat cga tgg aga
306

Arg Phe Thr Ala Ile Ala Pro Thr Arg Ala Val Leu Asp Arg Trp Arg
55 60 65 70

ggt gtg aac aaa caa aca gcg atg aaa cac ctt ctg agt ttt aag aag
354

Gly Val Asn Lys Gln Thr Ala Met Lys His Leu Leu Ser Phe Lys Lys
75 80 85

gaa cta ggg acc ttg acc agt gct atc aat cgg cgg agc tca aaa caa
402

Glu Leu Gly Thr Leu Thr Ser Ala Ile Asn Arg Arg Ser Ser Lys Gln
90 95 100

aag aaa aga gga gga aag acc gga att gca gtc atg att ggc ctg atc
450

Lys Lys Arg Gly Gly Lys Thr Gly Ile Ala Val Met Ile Gly Leu Ile
105 110 115

gcc agc gta gga gca gtt acc ctc tct aac ttc caa ggg aag gtg atg
498

Ala Ser Val Gly Ala Val Thr Leu Ser Asn Phe Gln Gly Lys Val Met
120 125 130 



atg acg gta aat get act gac gtc aca gat gtc ate acg att cca aca 
546 

Met Thr Val Asn Ala Thr Asp Val Thr Asp Val He Thr He Pro Thr 

135 140 145 150 

get get gga aag aae eta tgc att gte aga gea atg gat gtg gga tae 

594 

Ala Ala Gly Lys Asn Leu Cys He Val Arg Ala Met Asp Val Gly Tyr 
155 160 165 

atg tge gat gat aet ate act tat gaa tgc cea gtg etg teg get ggt 
642 

Met Cys Asp Asp Thr He Thr Tyr Glu Cys Pro Val Leu Ser Ala Gly 

170 175 180 

aat gat cca gaa gac ate gac tgt tgg tge aca aag tea gea gtc tac 

690 

Asn Asp Pro Glu Asp He Asp Cys Trp Cys Thr Lys Ser Ala Val Tyr 
185 190 195 

gtc agg tat gga aga tgc ace aag aca cgc eac tea aga egc agt egg 
738 

Val Arg Tyr Gly Arg Cys Thr Lys Thr Arg His Ser Arg Arg Ser Arg 
200 205 210 

agg tea etg aca gtg eag aca eac gga gaa age aet eta geg aae aag 

786 

Arg Ser Leu Thr Val Gin Thr His Gly Glu Ser Thr Leu Ala Asn Lys 
215 220 225 230 

aag ggg get tgg atg gac age ace aag gee aca agg tat ttg gta aaa 
834 

Lys Gly Ala Trp Met Asp Ser Thr Lys Ala Thr Arg Tyr Leu Val Lys 
235 240 245 

aca gaa tea tgg ate ttg agg aae cet gga tat gee etg gtg gea gcc 
882 

Thr Glu Ser Trp He Leu Arg Asn Pro Gly Tyr Ala Leu Val Ala Ala 
250 255 260 

gtc att ggt tgg atg ett ggg age aac ace atg cag aga gtt gtg ttt 
930 

Val He Gly Trp Met Leu Gly Ser Asn Thr Met Gin Arg Val Val Phe 
265 270 275 

gtc gtg eta ttg ett ttg gtg gee cea get tac age ttc aac tge ett 
978 

Val Val Leu Leu Leu Leu Val Ala Pro Ala Tyr Ser Phe Asn Cys Leu 
280 285 290 

gga atg agc aac aga gac ttc ttg gaa gga gtg tct gga gca aca tgg
1026

Gly Met Ser Asn Arg Asp Phe Leu Glu Gly Val Ser Gly Ala Thr Trp
295 300 305 310

gtg gat ttg gtt ctc gaa ggc gac agc tgc gtg act atc atg tct aag
1074

Val Asp Leu Val Leu Glu Gly Asp Ser Cys Val Thr Ile Met Ser Lys
315 320 325

gac aag cct acc atc gat gtg aag atg atg aat atg gag gcg gcc aac
1122

Asp Lys Pro Thr Ile Asp Val Lys Met Met Asn Met Glu Ala Ala Asn
330 335 340

ctg gca gag gtc cgc agt tat tgc tat ttg gct acc gtc agc gat ctc
1170

Leu Ala Glu Val Arg Ser Tyr Cys Tyr Leu Ala Thr Val Ser Asp Leu
345 350 355

tcc acc aaa gct gcg tgc ccg acc atg gga gaa gct cac aat gac aaa
1218

Ser Thr Lys Ala Ala Cys Pro Thr Met Gly Glu Ala His Asn Asp Lys
360 365 370

cgt gct gac cca gct ttt gtg tgc aga caa gga gtg gtg gac agg ggc
1266

Arg Ala Asp Pro Ala Phe Val Cys Arg Gln Gly Val Val Asp Arg Gly
375 380 385 390

tgg ggc aac ggc tgc gga cta ttt ggc aaa gga agc att gac aca tgc
1314

Trp Gly Asn Gly Cys Gly Leu Phe Gly Lys Gly Ser Ile Asp Thr Cys
395 400 405

gcc aaa ttt gcc tgc tct acc aag gca ata gga aga acc atc ttg aaa
1362

Ala Lys Phe Ala Cys Ser Thr Lys Ala Ile Gly Arg Thr Ile Leu Lys
410 415 420

gag aat atc aag tac gaa gtg gcc att ttt gtc cat gga cca act act
1410

Glu Asn Ile Lys Tyr Glu Val Ala Ile Phe Val His Gly Pro Thr Thr
425 430 435

gtg gag tcg cac gga aac tac tcc aca cag gtt gga gcc act cag gca
1458

Val Glu Ser His Gly Asn Tyr Ser Thr Gln Val Gly Ala Thr Gln Ala
440 445 450

ggg nnn ttc agc atc act cct gcg gcg cct tca tac aca cta aag ctt
1506

Gly Arg Phe Ser Ile Thr Pro Ala Ala Pro Ser Tyr Thr Leu Lys Leu
455 460 465 470



gga gaa tat gga gag gtg aca gtg gac tgt gaa cca cgg tca ggg att
1554

Gly Glu Tyr Gly Glu Val Thr Val Asp Cys Glu Pro Arg Ser Gly Ile
475 480 485

gac acc aat gca tac tac gtg atg act gtt gga aca aag acg ttc ttg
1602

Asp Thr Asn Ala Tyr Tyr Val Met Thr Val Gly Thr Lys Thr Phe Leu
490 495 500

gtc cat cgt gag tgg ttc atg gac ctc aac ctc cct tgg agc agt gct
1650

Val His Arg Glu Trp Phe Met Asp Leu Asn Leu Pro Trp Ser Ser Ala
505 510 515

gga agt act gtg tgg agg aac aga gag acg tta atg gag ttt gag gaa
1698

Gly Ser Thr Val Trp Arg Asn Arg Glu Thr Leu Met Glu Phe Glu Glu
520 525 530

cca cac gcc acg aag cag tct gtg ata gca ttg ggc tca caa gag gga
1746

Pro His Ala Thr Lys Gln Ser Val Ile Ala Leu Gly Ser Gln Glu Gly
535 540 545 550

gct ctg cat caa gct ttg gct gga gcc att cct gtg gaa ttt tca agc
1794

Ala Leu His Gln Ala Leu Ala Gly Ala Ile Pro Val Glu Phe Ser Ser
555 560 565

aac act gtc aag ttg acg tcg ggt cat ttg aag tgt aga gtg aag atg
1842

Asn Thr Val Lys Leu Thr Ser Gly His Leu Lys Cys Arg Val Lys Met
570 575 580

gaa aaa ttg cag ttg aag gga aca acc tat ggc gtc tgt tca aag gct
1890

Glu Lys Leu Gln Leu Lys Gly Thr Thr Tyr Gly Val Cys Ser Lys Ala
585 590 595

ttc aag ttt ctt ggg act ccc gca gac aca ggt cac ggc act gtg gtg
1938

Phe Lys Phe Leu Gly Thr Pro Ala Asp Thr Gly His Gly Thr Val Val
600 605 610

ttg gaa ttg cag tac act ggc acg gat gga cct tgc aaa gtt cct atc
1986

Leu Glu Leu Gln Tyr Thr Gly Thr Asp Gly Pro Cys Lys Val Pro Ile
615 620 625 630



tcg tca gtg gct tca ttg aac gac cta acg cca gtg ggc aga ttg gtc
2034

Ser Ser Val Ala Ser Leu Asn Asp Leu Thr Pro Val Gly Arg Leu Val
635 640 645

act gtc aac cct ttt gtt tca gtg gcc acg gcc aac gct aag gtc ctg
2082

Thr Val Asn Pro Phe Val Ser Val Ala Thr Ala Asn Ala Lys Val Leu
650 655 660

att gaa ttg gaa cca ccc ttt gga gac tca tac ata gtg gtg ggc aga
2130

Ile Glu Leu Glu Pro Pro Phe Gly Asp Ser Tyr Ile Val Val Gly Arg
665 670 675

gga gaa caa cag atc aat cac cat tgg cac aag tct gga agc agc att
2178

Gly Glu Gln Gln Ile Asn His His Trp His Lys Ser Gly Ser Ser Ile
680 685 690

ggc aaa gcc ttt aca acc acc ctc aaa gga gcg cag aga cta gcc gct
2226

Gly Lys Ala Phe Thr Thr Thr Leu Lys Gly Ala Gln Arg Leu Ala Ala
695 700 705 710

cta gga gac aca gct tgg gac ttt gga tca gtt gga ggg gtg ttc acc
2274

Leu Gly Asp Thr Ala Trp Asp Phe Gly Ser Val Gly Gly Val Phe Thr
715 720 725

tca gtt ggg aag gct gtc cat caa gtg ttc gga gga gca ttc cgc tca
2322

Ser Val Gly Lys Ala Val His Gln Val Phe Gly Gly Ala Phe Arg Ser
730 735 740

ctg ttc gga ggc atg tcc tgg ata acg caa gga ttg ctg ggg gct ctc
2370

Leu Phe Gly Gly Met Ser Trp Ile Thr Gln Gly Leu Leu Gly Ala Leu
745 750 755

ctg ttg tgg atg ggc atc aat gct cgt gat agg tcc ata gct ctc acg
2418

Leu Leu Trp Met Gly Ile Asn Ala Arg Asp Arg Ser Ile Ala Leu Thr
760 765 770

ttt ctc gca gtt gga gga gtt ctg ctc ttc ctc tcc gtg aac gtg cac
2466

Phe Leu Ala Val Gly Gly Val Leu Leu Phe Leu Ser Val Asn Val His
775 780 785 790

gct gac act ggg tgt gcc ata gac atc agc cgg caa gag ctg aga tgt
2514

Ala Asp Thr Gly Cys Ala Ile Asp Ile Ser Arg Gln Glu Leu Arg Cys
795 800 805

gga agt gga gtg ttc ata cac aat gat gtg gag gct tgg atg gac cgg
2562

Gly Ser Gly Val Phe Ile His Asn Asp Val Glu Ala Trp Met Asp Arg
810 815 820

tac aag tat tac cct gaa acg cca caa ggc cta gcc aag atc att cag
2610

Tyr Lys Tyr Tyr Pro Glu Thr Pro Gln Gly Leu Ala Lys Ile Ile Gln
825 830 835

aaa gct cat aag gaa gga gtg tgc ggt cta cga tca gtt tcc aga ctg
2658

Lys Ala His Lys Glu Gly Val Cys Gly Leu Arg Ser Val Ser Arg Leu
840 845 850

gag cat caa atg tgg gaa gca gtg aag gac gag ctg aac act ctt ttg
2706

Glu His Gln Met Trp Glu Ala Val Lys Asp Glu Leu Asn Thr Leu Leu
855 860 865 870

aag gag aat ggt gtg gac ctt agt gtc gtg gtt gag aaa cag gag gga
2754

Lys Glu Asn Gly Val Asp Leu Ser Val Val Val Glu Lys Gln Glu Gly
875 880 885

atg tac aag tca gca cct aaa cgc ctc acc gcc acc acg gaa aaa ttg
2802

Met Tyr Lys Ser Ala Pro Lys Arg Leu Thr Ala Thr Thr Glu Lys Leu
890 895 900

gaa att ggc tgg aag gcc tgg gga aag agt att tta ttt gca cca gaa
2850

Glu Ile Gly Trp Lys Ala Trp Gly Lys Ser Ile Leu Phe Ala Pro Glu
905 910 915

ctc gcc aac aac acc ttt gtg gtt gat ggt ccg gag acc aag gaa tgt
2898

Leu Ala Asn Asn Thr Phe Val Val Asp Gly Pro Glu Thr Lys Glu Cys
920 925 930

ccg act cag aat cgc gct tgg aat agc tta gaa gtg gag gat ttt gga
2946

Pro Thr Gln Asn Arg Ala Trp Asn Ser Leu Glu Val Glu Asp Phe Gly
935 940 945 950

ttt ggt ctc acc agc act cgg atg ttc ctg aag gtc aga gag agc aac
2994

Phe Gly Leu Thr Ser Thr Arg Met Phe Leu Lys Val Arg Glu Ser Asn
955 960 965



aca act gaa tgt gac tcg aag atc att gga acg gct gtc aag aac aac
3042

Thr Thr Glu Cys Asp Ser Lys Ile Ile Gly Thr Ala Val Lys Asn Asn
970 975 980

ttg gcg atc cac agt gac ctg tcc tat tgg att gaa agc agg ctc aat
3090

Leu Ala Ile His Ser Asp Leu Ser Tyr Trp Ile Glu Ser Arg Leu Asn
985 990 995

gat acg tgg aag ctt gaa agg gca gtt ctg ggt gaa gtc aaa tca
3135

Asp Thr Trp Lys Leu Glu Arg Ala Val Leu Gly Glu Val Lys Ser
1000 1005 1010

tgt acg tgg cct gag acg cat acc ttg tgg ggc gat gga atc ctt
3180

Cys Thr Trp Pro Glu Thr His Thr Leu Trp Gly Asp Gly Ile Leu
1015 1020 1025

gag agt gac ttg ata ata cca gtc aca ctg gcg gga cca cga agc
3225

Glu Ser Asp Leu Ile Ile Pro Val Thr Leu Ala Gly Pro Arg Ser
1030 1035 1040

aat cac aat cgg aga cct ggg tac aag aca caa aac cag ggc cca
3270

Asn His Asn Arg Arg Pro Gly Tyr Lys Thr Gln Asn Gln Gly Pro
1045 1050 1055

tgg gac gaa ggc cgg gta gag att gac ttc gat tac tgc cca gga
3315

Trp Asp Glu Gly Arg Val Glu Ile Asp Phe Asp Tyr Cys Pro Gly
1060 1065 1070

act acg gtc acc ctg agt gag agc tgc gga cac cgt gga cct gcc
3360

Thr Thr Val Thr Leu Ser Glu Ser Cys Gly His Arg Gly Pro Ala
1075 1080 1085

act cgc acc acc aca gag agc gga aag ttg ata aca gat tgg tgc
3405

Thr Arg Thr Thr Thr Glu Ser Gly Lys Leu Ile Thr Asp Trp Cys
1090 1095 1100

tgc agg agc tgc acc tta cca cca ctg cgc tac caa act gac agc
3450

Cys Arg Ser Cys Thr Leu Pro Pro Leu Arg Tyr Gln Thr Asp Ser
1105 1110 1115



ggc tgt tgg tat ggt atg gag atc aga cca cag aga cat gat gaa
3495

Gly Cys Trp Tyr Gly Met Glu Ile Arg Pro Gln Arg His Asp Glu
1120 1125 1130

aag acc ctc gtg cag tca caa gtg aat gct tat aat gct gat atg
3540

Lys Thr Leu Val Gln Ser Gln Val Asn Ala Tyr Asn Ala Asp Met
1135 1140 1145

att gac cct ttt cag ttg ggc ctt ctg gtc gtg ttc ttg gcc acc
3585

Ile Asp Pro Phe Gln Leu Gly Leu Leu Val Val Phe Leu Ala Thr
1150 1155 1160

cag gag gtc ctt cgc aag agg tgg aca gcc aag atc agc atg cca
3630

Gln Glu Val Leu Arg Lys Arg Trp Thr Ala Lys Ile Ser Met Pro
1165 1170 1175

gct ata ctg att gct ctg cta gtc ctg gtg ttt ggg ggc att act
3675

Ala Ile Leu Ile Ala Leu Leu Val Leu Val Phe Gly Gly Ile Thr
1180 1185 1190

tac act gat gtg tta cgc tat gtc atc ttg gtg ggg gca gct ttc
3720

Tyr Thr Asp Val Leu Arg Tyr Val Ile Leu Val Gly Ala Ala Phe
1195 1200 1205

gca gaa tct aat tcg gga gga gac gtg gta cac ttg gcg ctc atg
3765

Ala Glu Ser Asn Ser Gly Gly Asp Val Val His Leu Ala Leu Met
1210 1215 1220

gcg acc ttc aag ata caa cca gtg ttt atg gtg gca tcg ttt ctc
3810

Ala Thr Phe Lys Ile Gln Pro Val Phe Met Val Ala Ser Phe Leu
1225 1230 1235

aaa gcg aga tgg acc aac cag gag aac att ttg ttg atg ttg gcg
3855

Lys Ala Arg Trp Thr Asn Gln Glu Asn Ile Leu Leu Met Leu Ala
1240 1245 1250

gct gtt ttc ttt caa atg gct tat cac gat gcc cgc caa att ctg
3900

Ala Val Phe Phe Gln Met Ala Tyr His Asp Ala Arg Gln Ile Leu
1255 1260 1265

ctc tgg gag atc cct gat gtg ttg aat tca ctg gcg gta gct tgg
3945

Leu Trp Glu Ile Pro Asp Val Leu Asn Ser Leu Ala Val Ala Trp
1270 1275 1280

atg ata ctg aga gcc ata aca ttc aca acg aca tca aac gtg gtt
3990

Met Ile Leu Arg Ala Ile Thr Phe Thr Thr Thr Ser Asn Val Val
1285 1290 1295

gtt ccg ctg cta gcc ctg cta aca ccc ggg ctg aga tgc ttg aat
4035

Val Pro Leu Leu Ala Leu Leu Thr Pro Gly Leu Arg Cys Leu Asn
1300 1305 1310

ctg gat gtg tac agg ata ctg ctg ttg atg gtc gga ata ggc agc
4080

Leu Asp Val Tyr Arg Ile Leu Leu Leu Met Val Gly Ile Gly Ser
1315 1320 1325

ttg atc agg gag aag agg agt gca gct gca aaa aag aaa gga gca
4125

Leu Ile Arg Glu Lys Arg Ser Ala Ala Ala Lys Lys Lys Gly Ala
1330 1335 1340

agt ctg cta tgc ttg gct cta gcc tca aca gga ctt ttc aac ccc
4170

Ser Leu Leu Cys Leu Ala Leu Ala Ser Thr Gly Leu Phe Asn Pro
1345 1350 1355

atg atc ctt gct gct gga ctg att gca tgt gat ccc aac cgt aaa
4215

Met Ile Leu Ala Ala Gly Leu Ile Ala Cys Asp Pro Asn Arg Lys
1360 1365 1370

cgc gga tgg ccc gca act gaa gtg atg aca gct gtc ggc cta atg
4260

Arg Gly Trp Pro Ala Thr Glu Val Met Thr Ala Val Gly Leu Met
1375 1380 1385

ttt gcc atc gtc gga ggg ctg gca gag ctt gac att gac tcc atg
4305

Phe Ala Ile Val Gly Gly Leu Ala Glu Leu Asp Ile Asp Ser Met
1390 1395 1400

gcc att cca atg act atc gcg ggg ctc atg ttt gct gct ttc gtg
4350

Ala Ile Pro Met Thr Ile Ala Gly Leu Met Phe Ala Ala Phe Val
1405 1410 1415

att tct ggg aaa tca aca gat atg tgg att gag aga acg gcg gac
4395

Ile Ser Gly Lys Ser Thr Asp Met Trp Ile Glu Arg Thr Ala Asp
1420 1425 1430



att tcc tgg gaa agt gat gca gaa att aca ggc tcg agc gaa aga
4440

Ile Ser Trp Glu Ser Asp Ala Glu Ile Thr Gly Ser Ser Glu Arg
1435 1440 1445

gtt gat gtg cgg ctt gat gat gat gga aac ttc cag ctc atg aat
4485

Val Asp Val Arg Leu Asp Asp Asp Gly Asn Phe Gln Leu Met Asn
1450 1455 1460

gat cca gga gca cct tgg aag ata tgg atg ctc aga atg gtc tgt
4530

Asp Pro Gly Ala Pro Trp Lys Ile Trp Met Leu Arg Met Val Cys
1465 1470 1475

ctc gcg att agt gcg tac acc ccc tgg gca atc ttg ccc tca gta
4575

Leu Ala Ile Ser Ala Tyr Thr Pro Trp Ala Ile Leu Pro Ser Val
1480 1485 1490

gtt gga ttt tgg ata act ctc caa tac aca aag aga gga ggc gtg
4620

Val Gly Phe Trp Ile Thr Leu Gln Tyr Thr Lys Arg Gly Gly Val
1495 1500 1505

ttg tgg gac act ccc tca cca aag gag tac aaa aag ggg gac acg
4665

Leu Trp Asp Thr Pro Ser Pro Lys Glu Tyr Lys Lys Gly Asp Thr
1510 1515 1520

acc acc ggc gtc tac agg atc atg act cgt ggg ctg ctc ggc agt
4710

Thr Thr Gly Val Tyr Arg Ile Met Thr Arg Gly Leu Leu Gly Ser
1525 1530 1535

tat caa gca gga gcg ggc gtg atg gtt gaa ggt gtt ttc cac acc
4755

Tyr Gln Ala Gly Ala Gly Val Met Val Glu Gly Val Phe His Thr
1540 1545 1550

ctt tgg cat aca aca aaa gga gcc gct ttg atg agc gga gag ggc
4800

Leu Trp His Thr Thr Lys Gly Ala Ala Leu Met Ser Gly Glu Gly
1555 1560 1565

cgc ctg gac cca tac tgg ggc agt gtc aag gag gat cga ctt tgt
4845

Arg Leu Asp Pro Tyr Trp Gly Ser Val Lys Glu Asp Arg Leu Cys
1570 1575 1580



tac gga gga ccc tgg aaa ttg cag cac aag tgg aac ggg cag gat
4890

Tyr Gly Gly Pro Trp Lys Leu Gln His Lys Trp Asn Gly Gln Asp
1585 1590 1595

gag gtg cag atg att gtg gtg gaa cct ggc aag aac gtt aag aac
4935

Glu Val Gln Met Ile Val Val Glu Pro Gly Lys Asn Val Lys Asn
1600 1605 1610

gtc cag acg aaa cca ggg gtg ttc aaa aca cct gaa gga gaa atc
4980

Val Gln Thr Lys Pro Gly Val Phe Lys Thr Pro Glu Gly Glu Ile
1615 1620 1625

ggg gcc gtg act ttg gac ttc ccc act gga aca tca ggc tca cca
5025

Gly Ala Val Thr Leu Asp Phe Pro Thr Gly Thr Ser Gly Ser Pro
1630 1635 1640

ata gtg gac aaa aac ggt gat gtg att ggg ctt tat ggc aat gga
5070

Ile Val Asp Lys Asn Gly Asp Val Ile Gly Leu Tyr Gly Asn Gly
1645 1650 1655

gtc ata atg ccc aac ggc tca tac ata agc gcg ata gtg cag ggt
5115

Val Ile Met Pro Asn Gly Ser Tyr Ile Ser Ala Ile Val Gln Gly
1660 1665 1670

gaa agg atg gat gag cca atc cca gcc gga ttc gaa cct gag atg
5160

Glu Arg Met Asp Glu Pro Ile Pro Ala Gly Phe Glu Pro Glu Met
1675 1680 1685

ctg agg aaa aaa cag atc act gta ctg gat ctc cat ccc ggc gcc
5205

Leu Arg Lys Lys Gln Ile Thr Val Leu Asp Leu His Pro Gly Ala
1690 1695 1700

ggt aaa aca agg agg att ctg cca cag atc atc aaa gag gcc ata
5250

Gly Lys Thr Arg Arg Ile Leu Pro Gln Ile Ile Lys Glu Ala Ile
1705 1710 1715

aac aga aga ctg aga aca gcc gtg cta gca cca acc agg gtt gtg
5295

Asn Arg Arg Leu Arg Thr Ala Val Leu Ala Pro Thr Arg Val Val
1720 1725 1730

gct gct gag atg gct gaa gca ctg aga gga ctg ccc atc cgg tac
5340

Ala Ala Glu Met Ala Glu Ala Leu Arg Gly Leu Pro Ile Arg Tyr
1735 1740 1745

cag aca tcc gca gtg ccc aga gaa cat aat gga aat gag att gtt
5385

Gln Thr Ser Ala Val Pro Arg Glu His Asn Gly Asn Glu Ile Val
1750 1755 1760

gat gtc atg tgt cat gct acc ctc acc cac agg ctg atg tct cct
5430

Asp Val Met Cys His Ala Thr Leu Thr His Arg Leu Met Ser Pro
1765 1770 1775

cac agg gtg ccg aac tac aac ctg ttc gtg atg gat gag gct cat
5475

His Arg Val Pro Asn Tyr Asn Leu Phe Val Met Asp Glu Ala His
1780 1785 1790

ttc acc gac cca gct agc att gca gca aga ggt tac att tcc aca
5520

Phe Thr Asp Pro Ala Ser Ile Ala Ala Arg Gly Tyr Ile Ser Thr
1795 1800 1805

aag gtc gag cta ggg gag gcg gcg gca ata ttc atg aca gcc acc
5565

Lys Val Glu Leu Gly Glu Ala Ala Ala Ile Phe Met Thr Ala Thr
1810 1815 1820

cca cca ggc act tca gat cca ttc cca gag tcc aat tca cca att
5610

Pro Pro Gly Thr Ser Asp Pro Phe Pro Glu Ser Asn Ser Pro Ile
1825 1830 1835

tcc gac tta cag act gag atc ccg gat cga gct tgg aac tct gga
5655

Ser Asp Leu Gln Thr Glu Ile Pro Asp Arg Ala Trp Asn Ser Gly
1840 1845 1850

tac gaa tgg atc aca gaa tac acc ggg aag acg gtt tgg ttt gtg
5700

Tyr Glu Trp Ile Thr Glu Tyr Thr Gly Lys Thr Val Trp Phe Val
1855 1860 1865

cct agt gtc aag atg ggg aat gag att gcc ctt tgc cta caa cgt
5745

Pro Ser Val Lys Met Gly Asn Glu Ile Ala Leu Cys Leu Gln Arg
1870 1875 1880

gct gga aag aaa gta gtc caa ttg aac aga aag tcg tac gag acg
5790

Ala Gly Lys Lys Val Val Gln Leu Asn Arg Lys Ser Tyr Glu Thr
1885 1890 1895



gag tac cca aaa tgt aag aac gat gat tgg gac ttt gtt atc aca
5835

Glu Tyr Pro Lys Cys Lys Asn Asp Asp Trp Asp Phe Val Ile Thr
1900 1905 1910

aca gac ata tct gaa atg ggg gct aac ttc aag gcg agc agg gtg
5880

Thr Asp Ile Ser Glu Met Gly Ala Asn Phe Lys Ala Ser Arg Val
1915 1920 1925

att gac agc cgg aag agt gtg aaa cca acc atc ata aca gaa gga
5925

Ile Asp Ser Arg Lys Ser Val Lys Pro Thr Ile Ile Thr Glu Gly
1930 1935 1940

gaa ggg aga gtg atc ctg gga gaa cca tct gca gtg aca gca gct
5970

Glu Gly Arg Val Ile Leu Gly Glu Pro Ser Ala Val Thr Ala Ala
1945 1950 1955

agt gcc gcc cag aga cgt gga cgt atc ggt aga aat ccg tcg caa
6015

Ser Ala Ala Gln Arg Arg Gly Arg Ile Gly Arg Asn Pro Ser Gln
1960 1965 1970

gtt ggt gat gag tac tgt tat ggg ggg cac acg aat gaa gac gac
6060

Val Gly Asp Glu Tyr Cys Tyr Gly Gly His Thr Asn Glu Asp Asp
1975 1980 1985

tcg aac ttc gcc cat tgg act gag gca cga atc atg ctg gac aac
6105

Ser Asn Phe Ala His Trp Thr Glu Ala Arg Ile Met Leu Asp Asn
1990 1995 2000

atc aac atg cca aac gga ctg atc gct caa ttc tac caa cca gag
6150

Ile Asn Met Pro Asn Gly Leu Ile Ala Gln Phe Tyr Gln Pro Glu
2005 2010 2015

cgt gag aag gta tat acc atg gat ggg gaa tac cgg ctc aga gga
6195

Arg Glu Lys Val Tyr Thr Met Asp Gly Glu Tyr Arg Leu Arg Gly
2020 2025 2030

gaa gag aga aaa aac ttt ctg gaa ctg ttg agg act gca gat ctg
6240

Glu Glu Arg Lys Asn Phe Leu Glu Leu Leu Arg Thr Ala Asp Leu
2035 2040 2045



cca gtt tgg ctg gct tac aag gtt gca gcg gct gga gtg tca tac
6285

Pro Val Trp Leu Ala Tyr Lys Val Ala Ala Ala Gly Val Ser Tyr
2050 2055 2060

cac gac cgg agg tgg tgc ttt gat ggt cct agg aca aac aca att
6330

His Asp Arg Arg Trp Cys Phe Asp Gly Pro Arg Thr Asn Thr Ile
2065 2070 2075

tta gaa gac aac aac gaa gtg gaa gtc atc acg aag ctt ggt gaa
6375

Leu Glu Asp Asn Asn Glu Val Glu Val Ile Thr Lys Leu Gly Glu
2080 2085 2090

agg aag att ctg agg ccg cgc tgg att gac gcc agg gtg tac tcg
6420

Arg Lys Ile Leu Arg Pro Arg Trp Ile Asp Ala Arg Val Tyr Ser
2095 2100 2105

gat cac cag gca cta aag gcg ttc aag gac ttc gcc tcg gga aaa
6465

Asp His Gln Ala Leu Lys Ala Phe Lys Asp Phe Ala Ser Gly Lys
2110 2115 2120

cgt tct cag ata ggg ctc att gag gtt ctg gga aag atg cct gag
6510

Arg Ser Gln Ile Gly Leu Ile Glu Val Leu Gly Lys Met Pro Glu
2125 2130 2135

cac ttc atg ggg aag aca tgg gaa gca ctt gac acc atg tac gtt
6555

His Phe Met Gly Lys Thr Trp Glu Ala Leu Asp Thr Met Tyr Val
2140 2145 2150

gtg gcc act gca gag aaa gga gga aga gct cac aga atg gcc ctg
6600

Val Ala Thr Ala Glu Lys Gly Gly Arg Ala His Arg Met Ala Leu
2155 2160 2165

gag gaa ctg cca gat gct ctt cag aca att gcc ttg att gcc tta
6645

Glu Glu Leu Pro Asp Ala Leu Gln Thr Ile Ala Leu Ile Ala Leu
2170 2175 2180

ttg agt gtg atg acc atg gga gta ttc ttc ctc ctc atg cag cgg
6690

Leu Ser Val Met Thr Met Gly Val Phe Phe Leu Leu Met Gln Arg
2185 2190 2195

aag ggc att gga aag ata ggt ttg gga ggc gct gtc ttg gga gtc
6735

Lys Gly Ile Gly Lys Ile Gly Leu Gly Gly Ala Val Leu Gly Val
2200 2205 2210

gcg acc ttt ttc tgt tgg atg gct gaa gtt cca gga acg aag atc
6780

Ala Thr Phe Phe Cys Trp Met Ala Glu Val Pro Gly Thr Lys Ile
2215 2220 2225

gcc gga atg ttg ctg ctc tcc ctt ctc ttg atg att gtg cta att
6825

Ala Gly Met Leu Leu Leu Ser Leu Leu Leu Met Ile Val Leu Ile
2230 2235 2240

cct gag cca gag aag caa cgt tcg cag aca gac aac cag cta gcc
6870

Pro Glu Pro Glu Lys Gln Arg Ser Gln Thr Asp Asn Gln Leu Ala
2245 2250 2255

gtg ttc ctg att tgt gtc atg acc ctt gtg agc gca gtg gca gcc
6915

Val Phe Leu Ile Cys Val Met Thr Leu Val Ser Ala Val Ala Ala
2260 2265 2270

aac gag atg ggt tgg cta gat aag acc aag agt gac ata agc agt
6960

Asn Glu Met Gly Trp Leu Asp Lys Thr Lys Ser Asp Ile Ser Ser
2275 2280 2285

ttg ttt ggg caa aga att gag gtc aag gag aat ttc agc atg gga
7005

Leu Phe Gly Gln Arg Ile Glu Val Lys Glu Asn Phe Ser Met Gly
2290 2295 2300

gag ttt ctt ttg gac ttg agg ccg gca aca gcc tgg tca ctg tac
7050

Glu Phe Leu Leu Asp Leu Arg Pro Ala Thr Ala Trp Ser Leu Tyr
2305 2310 2315

gct gtg aca aca gcg gtc ctc act cca ctg cta aag cat ttg atc
7095

Ala Val Thr Thr Ala Val Leu Thr Pro Leu Leu Lys His Leu Ile
2320 2325 2330

acg tca gat tac atc aac acc tca ttg acc tca ata aac gtt cag
7140

Thr Ser Asp Tyr Ile Asn Thr Ser Leu Thr Ser Ile Asn Val Gln
2335 2340 2345

gca agt gca cta ttc aca ctc gcg cga ggc ttc ccc ttc gtc gat
7185

Ala Ser Ala Leu Phe Thr Leu Ala Arg Gly Phe Pro Phe Val Asp
2350 2355 2360



gtt gga gtg tcg gct ctc ctg cta gca gcc ggn tgc tgg gga caa
7230

Val Gly Val Ser Ala Leu Leu Leu Ala Ala Gly Cys Trp Gly Gln
2365 2370 2375

gtc acc ctc acc gtt acg gta aca gcg gca aca ctc ctt ttt tgc
7275

Val Thr Leu Thr Val Thr Val Thr Ala Ala Thr Leu Leu Phe Cys
2380 2385 2390

cac tat gcc tac atg gtt ccc ggt tgg caa gct gag gca atg cgc
7320

His Tyr Ala Tyr Met Val Pro Gly Trp Gln Ala Glu Ala Met Arg
2395 2400 2405

tca gcc cag cgg cgg aca gcg gcc gga atc atg aag aac gct gta
7365

Ser Ala Gln Arg Arg Thr Ala Ala Gly Ile Met Lys Asn Ala Val
2410 2415 2420

gtg gat ggc atc gtg gcc acg gac gtc cca gaa tta gag cgc acc
7410

Val Asp Gly Ile Val Ala Thr Asp Val Pro Glu Leu Glu Arg Thr
2425 2430 2435

aca ccc atc atg cag aag aaa gtt gga cag atc atg ctg atc ttg
7455

Thr Pro Ile Met Gln Lys Lys Val Gly Gln Ile Met Leu Ile Leu
2440 2445 2450

gtg tct cta gct gca gta gta gtg aac ccg tct gtg aag aca gta
7500

Val Ser Leu Ala Ala Val Val Val Asn Pro Ser Val Lys Thr Val
2455 2460 2465

cga gaa gcc gga att ttg atc acg gcc gca gcg gtg acg ctt tgg
7545

Arg Glu Ala Gly Ile Leu Ile Thr Ala Ala Ala Val Thr Leu Trp
2470 2475 2480

gag aat gga gca agc tct gtt tgg aac gca aca act gcc atc gga
7590

Glu Asn Gly Ala Ser Ser Val Trp Asn Ala Thr Thr Ala Ile Gly
2485 2490 2495

ctc tgc cac atc atg cgt ggg ggt tgg ttg tca tgt cta tcc ata
7635

Leu Cys His Ile Met Arg Gly Gly Trp Leu Ser Cys Leu Ser Ile
2500 2505 2510



aca tgg aca ctc ata aag aac atg gaa aaa cca gga cta aaa aga
7680

Thr Trp Thr Leu Ile Lys Asn Met Glu Lys Pro Gly Leu Lys Arg
2515 2520 2525

ggt ggg gca aaa gga cgc acc ttg gga gag gtt tgg aaa gaa aga
7725

Gly Gly Ala Lys Gly Arg Thr Leu Gly Glu Val Trp Lys Glu Arg
2530 2535 2540

ctc aac cag atg aca aaa gaa gag ttc act agg tac cgc aaa gag
7770

Leu Asn Gln Met Thr Lys Glu Glu Phe Thr Arg Tyr Arg Lys Glu
2545 2550 2555

gcc atc atc gaa gtc gat cgc tca gcg gca aaa cac gcc agg aaa
7815

Ala Ile Ile Glu Val Asp Arg Ser Ala Ala Lys His Ala Arg Lys
2560 2565 2570

gaa ggc aat gtc act gga ggg cat cca gtc tct agg ggc aca gca
7860

Glu Gly Asn Val Thr Gly Gly His Pro Val Ser Arg Gly Thr Ala
2575 2580 2585

aaa ctg aga tgg ctg gtc gaa cgg agg ttt ctc gaa ccg gtc gga
7905

Lys Leu Arg Trp Leu Val Glu Arg Arg Phe Leu Glu Pro Val Gly
2590 2595 2600

aaa gtg att gac ctt gga tgt gga aga ggc ggt tgg tgt tac tat
7950

Lys Val Ile Asp Leu Gly Cys Gly Arg Gly Gly Trp Cys Tyr Tyr
2605 2610 2615

atg gca acc caa aaa aga gtc caa gaa gtc aga ggg tac aca aag
7995

Met Ala Thr Gln Lys Arg Val Gln Glu Val Arg Gly Tyr Thr Lys
2620 2625 2630

ggc ggt ccc gga cat gaa gag ccc caa cta gtg caa agt tat gga
8040

Gly Gly Pro Gly His Glu Glu Pro Gln Leu Val Gln Ser Tyr Gly
2635 2640 2645

tgg aac att gtc acc atg aag agt gga gtg gat gtg ttc tac aga
8085

Trp Asn Ile Val Thr Met Lys Ser Gly Val Asp Val Phe Tyr Arg
2650 2655 2660

cct tct gag tgt tgt gac acc ctc ctt tgt gac atc gga gag tcc
8130

Pro Ser Glu Cys Cys Asp Thr Leu Leu Cys Asp Ile Gly Glu Ser
2665 2670 2675

tcg tca agt gct gag gtt gaa gag cat agg acg att cgg gtc ctt
8175

Ser Ser Ser Ala Glu Val Glu Glu His Arg Thr Ile Arg Val Leu
2680 2685 2690

gaa atg gtt gag gac tgg ctg cac cga ggg cca agg gaa ttt tgc
8220

Glu Met Val Glu Asp Trp Leu His Arg Gly Pro Arg Glu Phe Cys
2695 2700 2705

gtg aag gtg ctc tgc ccc tac atg ccg aaa gtc ata gag aag atg
8265

Val Lys Val Leu Cys Pro Tyr Met Pro Lys Val Ile Glu Lys Met
2710 2715 2720

gag ctg ctc caa cgc cgg tat ggg ggg gga ctg gtc aga aac cca
8310

Glu Leu Leu Gln Arg Arg Tyr Gly Gly Gly Leu Val Arg Asn Pro
2725 2730 2735

ctc tca cgg aat tcc acg cac gag atg tat tgg gtg agt cga gct
8355

Leu Ser Arg Asn Ser Thr His Glu Met Tyr Trp Val Ser Arg Ala
2740 2745 2750

tca ggc aat gtg gta cat tca gtg aat atg acc agc cag gtg ctc
8400

Ser Gly Asn Val Val His Ser Val Asn Met Thr Ser Gln Val Leu
2755 2760 2765

cta gga aga atg gaa aaa agg acc tgg aag gga ccc caa tac gag
8445

Leu Gly Arg Met Glu Lys Arg Thr Trp Lys Gly Pro Gln Tyr Glu
2770 2775 2780

gaa gat gta aac ttg gga agt gga acc agg gcg gtg gga aaa ccc
8490

Glu Asp Val Asn Leu Gly Ser Gly Thr Arg Ala Val Gly Lys Pro
2785 2790 2795

ctg ctc aac tca gac acc agt aaa atc aag aac agg att gaa cga
8535

Leu Leu Asn Ser Asp Thr Ser Lys Ile Lys Asn Arg Ile Glu Arg
2800 2805 2810

ctc agg cgt gag tac agt tcg acg tgg cac cac gat gag aac cac
8580

Leu Arg Arg Glu Tyr Ser Ser Thr Trp His His Asp Glu Asn His
2815 2820 2825



cca tat aga acc tgg aac tat cac ggc agt tat gat gtg aag ccc
8625

Pro Tyr Arg Thr Trp Asn Tyr His Gly Ser Tyr Asp Val Lys Pro
2830 2835 2840

aca ggc tcc gcc agt tcg ctg gtc aat gga gtg gtc agg ctc ctc
8670

Thr Gly Ser Ala Ser Ser Leu Val Asn Gly Val Val Arg Leu Leu
2845 2850 2855

tca aaa cca tgg gac acc atc acg aat gtt acc acc atg gcc atg
8715

Ser Lys Pro Trp Asp Thr Ile Thr Asn Val Thr Thr Met Ala Met
2860 2865 2870

act gac act act ccc ttc ggg cag cag cga gtg ttc aaa gag aag
8760

Thr Asp Thr Thr Pro Phe Gly Gln Gln Arg Val Phe Lys Glu Lys
2875 2880 2885

gtg gac acg aaa gct cct gaa ccg cca gaa gga gtg aag tac gtg
8805

Val Asp Thr Lys Ala Pro Glu Pro Pro Glu Gly Val Lys Tyr Val
2890 2895 2900

ctc aat gag acc acc aac tgg ttg tgg gcg ttt ttg gcc aga gaa
8850

Leu Asn Glu Thr Thr Asn Trp Leu Trp Ala Phe Leu Ala Arg Glu
2905 2910 2915

aaa cgt ccc aga atg tgc tct cga gag gaa ttc ata aga aag gtc
8895

Lys Arg Pro Arg Met Cys Ser Arg Glu Glu Phe Ile Arg Lys Val
2920 2925 2930

aac agc aat gca gct ttg ggt gcc atg ttt gaa gag cag aat caa
8940

Asn Ser Asn Ala Ala Leu Gly Ala Met Phe Glu Glu Gln Asn Gln
2935 2940 2945

tgg agg agc gcc aga gaa gca gtt gaa gat cca aaa ttt tgg gag
8985

Trp Arg Ser Ala Arg Glu Ala Val Glu Asp Pro Lys Phe Trp Glu
2950 2955 2960

atg gtg gat gag gag cgc gag gca cat ctg cgg ggg gaa tgt cac
9030

Met Val Asp Glu Glu Arg Glu Ala His Leu Arg Gly Glu Cys His
2965 2970 2975



act tgc att tac aac atg atg gga aag aga gag aaa aaa ccc gga
9075

Thr Cys Ile Tyr Asn Met Met Gly Lys Arg Glu Lys Lys Pro Gly
2980 2985 2990

gag ttc gga aag gcc aag gga agc aga gcc att tgg ttc atg tgg
9120

Glu Phe Gly Lys Ala Lys Gly Ser Arg Ala Ile Trp Phe Met Trp
2995 3000 3005

ctc gga gct cgc ttt ctg gag ttc gag gct ctg ggt ttt ctc aat
9165

Leu Gly Ala Arg Phe Leu Glu Phe Glu Ala Leu Gly Phe Leu Asn
3010 3015 3020

gaa gac cac tgg ctt gga aga aag aac tca gga gga ggt gtc gag
9210

Glu Asp His Trp Leu Gly Arg Lys Asn Ser Gly Gly Gly Val Glu
3025 3030 3035

ggc ttg ggc ctc caa aaa ctg ggt tac atc ctg cgt gaa gtt ggc
9255

Gly Leu Gly Leu Gln Lys Leu Gly Tyr Ile Leu Arg Glu Val Gly
3040 3045 3050

acc cgg cct ggg ggc aag atc tat gct gat gac aca gct ggc tgg
9300

Thr Arg Pro Gly Gly Lys Ile Tyr Ala Asp Asp Thr Ala Gly Trp
3055 3060 3065

gac acc cgc atc acg aga gct gac ttg gaa aat gaa gct aag gtg
9345

Asp Thr Arg Ile Thr Arg Ala Asp Leu Glu Asn Glu Ala Lys Val
3070 3075 3080

ctt gag ctg ctt gat ggg gaa cat cgg cgt ctt gcc agg gcc atc
9390

Leu Glu Leu Leu Asp Gly Glu His Arg Arg Leu Ala Arg Ala Ile
3085 3090 3095

att gag ctc acc tat cgt cac aaa gtt gtg aaa gtg atg cgc ccg
9435

Ile Glu Leu Thr Tyr Arg His Lys Val Val Lys Val Met Arg Pro
3100 3105 3110

gct gct gat gga aga acc gtc atg gat gtt atc tcc aga gaa gat
9480

Ala Ala Asp Gly Arg Thr Val Met Asp Val Ile Ser Arg Glu Asp
3115 3120 3125

cag agg ggg agt gga caa gtt gtc acc tac gcc cta aac act ttc
9525

Gln Arg Gly Ser Gly Gln Val Val Thr Tyr Ala Leu Asn Thr Phe
3130 3135 3140

acc aac ctg gcc gtc cag ctg gtg agg atg atg gaa ggg gaa gga
9570

Thr Asn Leu Ala Val Gln Leu Val Arg Met Met Glu Gly Glu Gly
3145 3150 3155

gtg att ggc cca gat gat gtg gag aaa ctc aca aaa ggg aaa gga
9615

Val Ile Gly Pro Asp Asp Val Glu Lys Leu Thr Lys Gly Lys Gly
3160 3165 3170

ccc aaa gtc agg acc tgg ctg ttt gag aat ggg gaa gaa aga ctc
9660

Pro Lys Val Arg Thr Trp Leu Phe Glu Asn Gly Glu Glu Arg Leu
3175 3180 3185

agc cgc atg gct gtc agt gga gat gac tgt gtg gta aag ccc ctg
9705

Ser Arg Met Ala Val Ser Gly Asp Asp Cys Val Val Lys Pro Leu
3190 3195 3200

gac gat cgc ttt gcc acc tcg ctc cac ttc ctc aat gct atg tca
9750

Asp Asp Arg Phe Ala Thr Ser Leu His Phe Leu Asn Ala Met Ser
3205 3210 3215

aag gtt cgc aaa gac atc caa gag tgg aaa ccg tca act gga tgg
9795

Lys Val Arg Lys Asp Ile Gln Glu Trp Lys Pro Ser Thr Gly Trp
3220 3225 3230

tat gat tgg cag cag gtt cca ttt tgc tca aac cat ttc act gaa
9840

Tyr Asp Trp Gln Gln Val Pro Phe Cys Ser Asn His Phe Thr Glu
3235 3240 3245

ttg atc atg aaa gat gga aga aca ctg gtg gtt cca tgc cga gga
9885

Leu Ile Met Lys Asp Gly Arg Thr Leu Val Val Pro Cys Arg Gly
3250 3255 3260

cag gat gaa ttg gta ggc aga gct cgc ata tct cca ggg gcc gga
9930

Gln Asp Glu Leu Val Gly Arg Ala Arg Ile Ser Pro Gly Ala Gly
3265 3270 3275

tgg aac gtc cgc gac act gct tgt ctg gct aag tct tat gcc cag
9975

Trp Asn Val Arg Asp Thr Ala Cys Leu Ala Lys Ser Tyr Ala Gln
3280 3285 3290



atg tgg ctg ctt ctg tac ttc cac aga aga gac ctg egg etc atg 
10020 

Met Trp Leu Leu Leu Tyr Phe His Arg Arg Asp Leu Arg Leu Met 
3295 3300 3305 

gcc aac gcc att tgc tec get gtc cet gtg aat tgg gte eet ace 
10065 

Ala Asn Ala lie Cys Ser Ala Val Pro Val Asn Trp Val Pro Thr 
3310 3315 3320 

gga aga ace acg tgg tec ate eat gca gga gga gag tgg atg aca 
10110 

Gly Arg Thr Thr Trp Ser lie His Ala Gly Gly Glu Trp Met Thr 
3325 3330 3335 

aca gag gac atg ttg gag gte tgg aac cgt gtt tgg ata gag gag 
10155 

Thr Glu Asp Met Leu Glu Val Trp Asn Arg Val Trp lie Glu Glu 
3340 3345 3350 

aat gaa tgg atg gaa gac aaa ace cea gtg gag aaa tgg agt gac 
10200 

Asn Glu Trp Met Glu Asp Lys Thr Pro Val Glu Lys Trp Ser Asp 
3355 3360 3365 

gtc cea tat tea gga aaa ega gag gac ate tgg tgt gge age ctg 
10245 

Val Pro Tyr Ser Gly Lys Arg Glu Asp lie Trp Cys Gly Ser Leu 
3370 3375 3380 

att gge aca aga gee ega gee acg tgg gca gaa aac ate cag gtg 

10290 

He Gly Thr Arg Ala Arg Ala Thr Trp Ala Glu Asn He Gin Val 
3385 3390 3395 

gct atc aac caa gtc aga gca atc atc gga gat gag aag tat gtg
10335

Ala Ile Asn Gln Val Arg Ala Ile Ile Gly Asp Glu Lys Tyr Val
3400 3405 3410 

gat tac atg agt tca cta aag aga tat gaa gac aca act ttg gtt

10380 

Asp Tyr Met Ser Ser Leu Lys Arg Tyr Glu Asp Thr Thr Leu Val 

3415 3420 3425 

gag gac aca gta ctg tag atatttaatc aattgtaaat agacaatata
10428 

Glu Asp Thr Val Leu 
3430 



agtatgcata aaagtgtagt tttatagtag tatttagtgg tgttagtgta aatagttaag
10488

aaaattttga ggagaaagtc aggccgggaa gttcccgcca ccggaagttg agtagacggt
10548

gctgcctgcg actcaacccc aggaggactg ggtgaacaaa gccgcgaagt gatccatgta
10608

agccctcaga accgtctcgg aaggaggacc ccacatgttg taacttcaaa gcccaatgtc
10668

agaccacgct acggcgtgct actctgcgga gagtgcagtc tgcgatagtg ccccaggagg
10728

actgggttaa caaaggcaaa ccaacgcccc acgcggccct agccccggta atggtgttaa
10788

ccagggcgaa aggactagag gttagaggag accccgcggt ttaaagtgca cggcccagcc
10848

tgactgaagc tgtaggtcag gggaaggact agaggttagt ggagaccccg tgccacaaaa
10908

caccacaaca aaacagcata ttgacacctg ggatagacta ggagatcttc tgctctgcac
10968

aaccagccac acggcacagt gcgccgacaa tggtggctgg tggtgcgaga acacaggatc
11028

t
11029



Figure 13. 5 kb C5 locus and PCR primers to amplify C5 arms (SEQ ID NO: 77)



CTGAAATTGTAATTTCTACATGTAGAGAAGGTTTTGATATTGATGGTTTTAACAGAAACG 
GACTTTAACATTAAAGATGTACATCTCTTCCAAAACTATAACTACCAAAATTGTCTTTGC
10 20 30 40 50 60 



TAGAAATTATATCAAGGGATAACATTTTATATGATATAGTTTTAAAGTGTAAGATGGAAT 
ATCTTTAATATAGTTCCCTATTGTAAAATATACTATATCAAAATTTCACATTCTACCTTA 
70 80 90 100 110 120



TAAATTTCATGTGCACAAGAGGCATAGGAGATAAAAGCATTTTCAGACTTTGTATAATGA 
ATTTAAAGTACACGTGTTCTCCGTATCCTCTATTTTCGTAAAAGTCTGAAACATATTACT 
130 140 150 160 170 180 



AGGAATATGATCAAATAAACAAGAATCTGTTAGTTAGTTACTTGGATAAATTAATCGAGA 
TCCTTATACTAGTTTATTTGTTCTTAGACAATCAATCAATGAACCTATTTAATTAGCTCT 
190 200 210 220 230 240 



CGCGTGATAAAATGACTATGTACCGTTATTGCATGAACGATATTATAAATATAGGTTCTC 
GCGCACTATTTTACTGATACATGGCAATAACGTACTTGCTATAATATTTATATCCAAGAG 
250 260 270 280 290 300 



(C5A1) GGCCGAATTC 

GTAGGAGAGAACTATTGACTATGGCAATGAATGTTAAATGTTATACTTTGGATGAAGCTA
CATCCTCTCTTGATAACTGATACCGTTACTTACAATTTACAATATGAAACCTACTTCGAT 
310 320 330 340 350 360 



TAAATATGCATTGGAAAAATAATCCATTTAAAGAAAGGATTCAAATACTACAAAACCTAA 
ATTTATACGTAACCTTTTTATTAGGTAAATTTCTTTCCTAAGTTTATGATGTTTTGGATT 
370 380 390 400 410 420 



GCGATAATATGTTAACTAAGCTTATTCTTAACGACGCTTTAAATATACACAAATAAACAT 
CGCTATTATACAATTGATTCGAATAAGAATTGCTGCGAAATTTATATGTGTTTATTTGTA 
430 440 450 460 470 480 



AATTTTTGTATAACCTAACAAATAACTAAAACATAAAAATAATAAAAGGAAATGTAATAT 
TTAAAAACATATTGGATTGTTTATTGATTTTGTATTTTTATTATTTTCCTTTACATTATA 
490 500 510 520 530 540 



CGTAATTATTTTACTCAGGAATGGGGTTAAATATTTATATCACGTGTATATCTATACTGT
GCATTAATAAAATGAGTCCTTACCCCAATTTATAAATATAGTGCACATATAGATATGACA
550 560 570 580 590 600



TATCGTATACTCTTTACAATTACTATTACGAATATGCAAGAGATAATAAGATTACGTATT 
ATAGCATATGAGAAATGTTAATGATAATGCTTATACGTTCTCTATTATTCTAATGCATAA 
610 620 630 640 650 660 



TAAGAGAATCTTGTCATGATAATTGGGTACGACATAGTGATAAATGCTATTTCGCATCGT 
ATTCTCTTAGAACAGTACTATTAACCCATGCTGTATCACTATTTACGATAAAGCGTAGCA 
670 680 690 700 710 720 



TACATAAAGTCAGTTGGAAAGATGGATTTGACAGATGTAACTTAATAGGTGCAAAAATGT
ATGTATTTCAGTCAACCTTTCTACCTAAACTGTCTACATTGAATTATCCACGTTTTTACA 
730 740 750 760 770 780 



TAAATAACAGCATTCTATCGGAAGATAGGATACCAGTTATATTATACAAAAATCACTGGT 
ATTTATTGTCGTAAGATAGCCTTCTATCCTATGGTCAATATAATATGTTTTTAGTGACCA 
790 800 810 820 830 840 



TGGATAAAACAGATTCTGCAATATTCGTAAAAGATGAAGATTACTGCGAATTTGTAAACT 
ACCTATTTTGTCTAAGACGTTATAAGCATTTTCTACTTCTAATGACGCTTAAACATTTGA 
850 860 870 880 890 900 



ATGACAATAAAAAGCCATTTATCTCAACGACATCGTGTAATTCTTCCATGTTTTATGTAT 
TACTGTTATTTTTCGGTAAATAGAGTTGCTGTAGCACATTAAGAAGGTACAAAATACATA 
910 920 930 940 950 960 



GTGTTTCAGATATTATGAGATTACTATAAACTTTTTGTATACTTATATTCCGTAAACTAT
CACAAAGTCTATAATACTCTAATGATATTTGAAAAACATATGAATATAAGGCATTTGATA 
970 980 990 1000 1010 1020 



ATTAATCATGAAGAAAATGAAAAAGTATAGAAGCTGTTCACGAGCGGTTGTTGAAAACAA 
TAATTAGTACTTCTTTTACTTTTTCATATCTTCGACAAGTGCTCGCCAACAACTTTTGTT 
1030 1040 1050 1060 1070 1080 



CAAAATTATACATTCAAGATGGCTTACATATACGTCTGTGAGGCTATCATGGATAATGAC 
GTTTTAATATGTAAGTTCTACCGAATGTATATGCAGACACTCCGATAGTACCTATTACTG 
1090 1100 1110 1120 1130 1140 



AATGCATCTCTAAATAGGTTTTTGGACAATGGATTCGACCCTAACACGGAATATGGTACT 
TTACGTAGAGATTTATCCAAAAACCTGTTACCTAAGCTGGGATTGTGCCTTATACCATGA 
1150 1160 1170 1180 1190 1200 



CTACAATCTCCTCTTGAAATGGCTGTAATGTTCAAGAATACCGAGGCTATAAAAATCTTG 
GATGTTAGAGGAGAACTTTACCGACATTACAAGTTCTTATGGCTCCGATATTTTTAGAAC 
1210 1220 1230 1240 1250 1260 



ATGAGGTATGGAGCTAAACCTGTAGTTACTGAATGCACAACTTCTTGTCTGCATGATGCG 
TACTCCATACCTCGATTTGGACATCAATGACTTACGTGTTGAAGAACAGACGTACTACGC 
1270 1280 1290 1300 1310 1320 



GTGTTGAGAGACGACTACAAAATAGTGAAAGATCTGTTGAAGAATAACTATGTAAACAAT 
CACAACTCTCTGCTGATGTTTTATCACTTTCTAGACAACTTCTTATTGATACATTTGTTA 
1330 1340 1350 1360 1370 1380 



GTTCTTTACAGCGGAGGCTTTACTCCTTTGTGTTTGGCAGCTTACCTTAACAAAGTTAAT 
CAAGAAATGTCGCCTCCGAAATGAGGAAACACAAACCGTCGAATGGAATTGTTTCAATTA 
1390 1400 1410 1420 1430 1440 



TTGGTTAAACTTCTATTGGCTCATTCGGCGGATGTAGATATTTCAAACACGGATCGGTTA 
AACCAATTTGAAGATAACCGAGTAAGCCGCCTACATCTATAAAGTTTGTGCCTAGCCAAT
1450 1460 1470 1480 1490 1500 



ACTCCTCTACATATAGCCGTATCAAATAAAAATTTAACAATGGTTAAACTTCTATTGAAC 
TGAGGAGATGTATATCGGCATAGTTTATTTTTAAATTGTTACCAATTTGAAGATAACTTG 
1510 1520 1530 1540 1550 1560 



AAAGGTGCTGATACTGACTTGCTGGATAACATGGGACGTACTCCTTTAATGATCGCTGTA 
TTTCCACGACTATGACTGAACGACCTATTGTACCCTGCATGAGGAAATTACTAGCGACAT 
1570 1580 1590 1600 1610 1620 



CAATCTGGAAATATTGAAATATGTAGCACACTACTTAAAAAAAATAAAATGTCCAGAACT 
GTTAGACCTTTATAACTTTATACATCGTGTGATGAATTTTTTTTATTTTACAGGTCTTGA 
1630 1640 1650 1660 1670 1680 



GGGAAAAATTGATCTTGCCAGCTGTAATTCATGGTAGAAAAGAAGTGCTCAGGCTACTTT 
CCCTTTTTAACTAGAACGGTCGACATTAAGTACCATCTTTTCTTCACGAGTCCGATGAAA 
1690 1700 1710 1720 1730 1740 



TCAACAAAGGAGCAGATGTAAACTACATCTTTGAAAGAAATGGAAAATCATATACTGTTT
AGTTGTTTCCTCGTCTACATTTGATGTAGAAACTTTCTTTACCTTTTAGTATATGACAAA
1750 1760 1770 1780 1790 1800



TGGAATTGATTAAAGAAAGTTACTCTGAGACACAAAAGAGGTAGCTGAAGTGGTACTCTC 
ACCTTAACTAATTTCTTTCAATGAGACTCTGTGTTTTCTCCATCGACTTCACCATGAGAG
1810 1820 1830 1840 1850 1860 



C5 ORF

MET GLN ASN ASP ASP CYS GLU ALA ARG SER ARG GLU ILE THR LEU TYR ASP PHE LEU
AAAATGCAGAACGATGACTGCGAAGCAAGAAGTAGAGAAATAACACTTTATGACTTTCTT
TTTTACGTCTTGCTACTGACGCTTCGTTCTTCATCTCTTTATTGTGAAATACTGAAAGAA

1870 1880 1890 1900 1910 1920

CCATGCACTGATTAATCGAfTATTTTTllCCATGOGCCC (C5B1) 



SER CYS ARG LYS ASP ARG ASP ILE MET MET
AGTTGTAGAAAAGATAGAGATATAATGATG
TCAACATCTTTTCTATCTCTATATTACTAC 
1930 1940 1950 



VAL ILE ASN ASN SER ASP ILE ALA SER LYS
GTCATAAATAACTCTGATATTGCAAGTAAA 
CAGTATTTATTGAGACTATAACGTTCATTT 
1960 1970 1980 



CYS ASN ASN LYS LEU ASP LEU PHE LYS ARG
TGCAATAATAAGTTAGATTTATTTAAAAGG 
ACGTTATTATTCAATCTAAATAAATTTTCC 
1990 2000 2010 



ILE VAL LYS ASN ARG LYS LYS GLU LEU ILE
ATAGTTAAAAATAGAAAAAAAGAGTTAATT 
TATCAATTTTTATCTTTTTTTCTCAATTAA 
2020 2030 2040 



CYS ARG VAL LYS ILE ILE HIS LYS ILE LEU LYS PHE ILE ASN THR HIS ASN ASN LYS ASN
TGTAGGGTTAAAATAATACATAAGATCTTAAAATTTATAAATACGCATAATAATAAAAAT 
ACATCCCAATTTTATTATGTATTCTAGAATTTTAAATATTTATGCGTATTATTATTTTTA 
2050 2060 2070 2080 2090 2100 



(C5C1) |GGATCdCGGGrrTTTTA']jGACTAGTTAATCACGGCCQ 

ARG LEU TYR LEU LEU PRO SER GLU ILE LYS PHE LYS ILE PHE THR TYR LEU THR TYR LYS
AGATTATACTTATTACCTTCAGAGATAAAATTTAAGATATTTACTTATTTAACTTATAAA 



TCTAATATGAATAATGGAAGTCTCTATTTTAAATTCTATAAATGAATAAATTGAATATTT 
2110 2120 2130 2140 2150 2160 

ASP LEU LYS CYS ILE ILE SER LYS ***
GATCTAAAATGCATAATTTCTAAATAATGAAAAAAAGTACATCATGAGCAACGCGTTAGT
CTAGATTTTACGTATTAAAGATTTATTACTTTTTTTCATGTAGTACTCGTTGCGCAATCA 
2170 2180 2190 2200 2210 2220 



ATATTTTACAATGGAGATTAACGCTCTATACCGTTCTATGTTTATTGATTCAGATGATGT
TATAAAATGTTACCTCTAATTGCGAGATATGGCAAGATACAAATAACTAAGTCTACTACA
2230 2240 2250 2260 2270 2280



TTTAGAAAAGAAAGTTATTGAATATGAAAACTTTAATGAAGATGAAGATGACGACGATGA 
AAATCTTTTCTTTCAATAACTTATACTTTTGAAATTACTTCTACTTCTACTGCTGCTACT 
2290 2300 2310 2320 2330 2340 



TTATTGTTGTAAATCTGTTTTAGATGAAGAAGATGACGCGCTAAAGTATACTATGGTTAC 
AATAACAACATTTAGACAAAATCTACTTCTTCTACTGCGCGATTTCATATGATACCAATG 
2350 2360 2370 2380 2390 2400 



AAAGTATAAGTCTATACTACTAATGGCGACTTGTGCAAGAAGGTATAGTATAGTGAAAAT 
TTTCATATTCAGATATGATGATTACCGCTGAACACGTTCTTCCATATCATATCACTTTTA
2410 2420 2430 2440 2450 2460 



GTTGTTAGATTATGATTATGAAAAACCAAATAAATCAGATCCATATCTAAAGGTATCTCC 
CAACAATCTAATACTAATACTTTTTGGTTTATTTAGTCTAGGTATAGATTTCCATAGAGG 
2470 2480 2490 2500 2510 2520 



TTTGCACATAATTTCATCTATTCCTAGTTTAGAATACTTTTCATTATATTTGTTTACAGC 
AAACGTGTATTAAAGTAGATAAGGATCAAATCTTATGAAAAGTAATATAAACAAATGTCG
2530 2540 2550 2560 2570 2580

GACGTCGG (C5D1)



TGAAGACGAAAAAAATATATCGATAATAGAAGATTATGTTAACTCTGCTAATAAGATGAA 
ACTTCTGCTTTTTTTATATAGCTATTATCTTCTAATACAATTGAGACGATTATTCTACTT 
2590 2600 2610 2620 2630 2640 



ATTGAATGAGTCTGTGATAATAGCTATAATCAGAGAAGTTCTAAAAGGAAATAAAAATCT 
TAACTTACTCAGACACTATTATCGATATTAGTCTCTTCAAGATTTTCCTTTATTTTTAGA 
2650 2660 2670 2680 2690 2700 



AACTGATCAGGATATAAAAACATTGGCTGATGAAATCAACAAGGAGGAACTGAATATAGC 
TTGACTAGTCCTATATTTTTGTAACCGACTACTTTAGTTGTTCCTCCTTGACTTATATCG 
2710 2720 2730 2740 2750 2760 



TAAACTATTGTTAGATAGAGGGGCCAAAGTAAATTACAAGGATGTTTACGGTTCTTCAGC 
ATTTGATAACAATCTATCTCCCCGGTTTCATTTAATGTTCCTACAAATGCCAAGAAGTCG 
2770 2780 2790 2800 2810 2820 



TCTCCATAGAGCTGCTATTGGTAGGAAACAGGATATGATAAAGCTGTTAATCGATCATGG 
AGAGGTATCTCGACGATAACCATCCTTTGTCCTATACTATTTCGACAATTAGCTAGTACC 
2830 2840 2850 2860 2870 2880 



AGCTGATGTAAACTCTTTAACTATTGCTAAAGATAATCTTATTAAAAAAAAATAATATCA
TCGACTACATTTGAGAAATTGATAACGATTTCTATTAGAATAATTTTTTTTTATTATAGT
2890 2900 2910 2920 2930 2940



CGTTTAGTAATATTAAAATATATTAATAACTCTATTACTAATAACTCCAGTGGATATGAA 
GCAAATCATTATAATTTTATATAATTATTGAGATAATGATTATTGAGGTCACCTATACTT 
2950 2960 2970 2980 2990 3000 



CATAATACGAAGTTTATACATTCTCATCAAAATCTTATTGACATCAAGTTAGATTGTGAA
GTATTATGCTTCAAATATGTAAGAGTAGTTTTAGAATAACTGTAGTTCAATCTAACACTT 
3010 3020 3030 3040 3050 3060 



AATGAGATTATGAAATTAAGGAATACAAAAATAGGATGTAAGAACTTACTAGAATGTTTT
TTACTCTAATACTTTAATTCCTTATGTTTTTATCCTACATTCTTGAATGATCTTACAAAA 
3070 3080 3090 3100 3110 3120 



ATCAATAATGATATGAATACAGTATCTAGGGCTATAAACAATGAAACGATTAAAAATTAT 
TAGTTATTACTATACTTATGTCATAGATCCCGATATTTGTTACTTTGCTAATTTTTAATA 
3130 3140 3150 3160 3170 3180 



AAAAATCATTTCCCTATATATAATACGCTCATAGAAAAATTCATTTCTGAAAGTATACTA 
TTTTTAGTAAAGGGATATATATTATGCGAGTATCTTTTTAAGTAAAGACTTTCATATGAT 
3190 3200 3210 3220 3230 3240 



AGACACGAATTATTGGATGGAGTTATAAATTCTTTTCAAGGATTCAATAATAAATTGCCT
TCTGTGCTTAATAACCTACCTCAATATTTAAGAAAAGTTCCTAAGTTATTATTTAACGGA 
3250 3260 3270 3280 3290 3300 



TACGAGATTCAGTACATTATACTGGAGAATCTTAATAACCATGAACTAAAAAAAATTTTA 
ATGCTCTAAGTCATGTAATATGACCTCTTAGAATTATTGGTACTTGATTTTTTTTAAAAT 
3310 3320 3330 3340 3350 3360 



GATAATATACATTAAAAAGGTAAATAGATCATCTGTTATTATAAGCAAAGATGCTTGTTG 
CTATTATATGTAATTTTTCCATTTATCTAGTAGACAATAATATTCGTTTCTACGAACAAC 
3370 3380 3390 3400 3410 3420 



CCAATAATATACAACAGGTATTTGTTTTTATTTTTAACTACATATTTGATGTTCATTCTC
GGTTATTATATGTTGTCCATAAACAAAAATAAAAATTGATGTATAAACTACAAGTAAGAG
3430 3440 3450 3460 3470 3480



TTTATATAGTATACACAGAAAATTCATAATCCACTTAGAATTTCTAGTTATCTAGTTTTT 
AAATATATCATATGTGTCTTTTAAGTATTAGGTGAATCTTAAAGATCAATAGATCAAAAA 
3490 3500 3510 3520 3530 3540 



CTAGAATATTGTACTTTATTTCTAATGGAATGGCTCTCCAGCCTAGTAATTTATTAATGT 
GATCTTATAACATGAAATAAAGATTACCTTACCGAGAGGTCGGATCATTAAATAATTACA 
3550 3560 3570 3580 3590 3600 



TAGCTGATATCTTGAAATCAGGATATTCTGCTCCGTGAAGAGAAAGTCCTCCAAAGTTGT 
ATCGACTATAGAACTTTAGTCCTATAAGACGAGGCACTTCTCTTTCAGGAGGTTTCAACA 
3610 3620 3630 3640 3650 3660 



ATATTTCCATCACTTTCATGGCTTCCTCTTTCCATAGTGTCTTCTATAAGCTGTCTATAT 
TATAAAGGTAGTGAAAGTACCGAAGGAGAAAGGTATCACAGAAGATATTCGACAGATATA
3670 3680 3690 3700 3710 3720 



ATTGTAAACTTTTCTGGTTTTATGCATTTTAAACATTTAGCAATCTCATTTTCATCACAA 
TAACATTTGAAAAGACCAAAATACGTAAAATTTGTAAATCGTTAGAGTAAAAGTAGTGTT
3730 3740 3750 3760 3770 3780 



TTAAGGCACAAATCTAACATGGAATGTCTACCATAACCCAATAAGGTTTTTTCATTTCCT 
AATTCCGTGTTTAGATTGTACCTTACAGATGGTATTGGGTTATTCCAAAAAAGTAAAGGA
3790 3800 3810 3820 3830 3840 



CTATCTCTAATACACACTGTTCTTTCCAGACTTTCAACACGCTGCTATTTTCTATTTTAT
GATAGAGATTATGTGTGACAAGAAAGGTCTGAAAGTTGTGCGACGATAAAAGATAAAATA
3850 3860 3870 3880 3890 3900 



TCAAGTCCATATTATAAGCGTCCTTGTTAGACACTTCATAATGTTTGCATTCTGGAATCA 
AGTTCAGGTATAATATTCGCAGGAACAATCTGTGAAGTATTACAAACGTAAGACCTTAGT 
3910 3920 3930 3940 3950 3960 



TCATGTTAGATACTATTAATTTAGCTACTTCTATGTTGTCATCAAAAGAGTTGCTATCTG 
AGTACAATCTATGATAATTAAATCGATGAAGATACAACAGTAGTTTTCTCAACGATAGAC 
3970 3980 3990 4000 4010 4020 



TAATTACACTAAGAGGTGTATCACCTGATAAAGAAGTAATAGAGACATCTGCTCTGAATT 
ATTAATGTGATTCTCCACATAGTGGACTATTTCTTCATTATCTCTGTAGACGAGACTTAA 
4030 4040 4050 4060 4070 4080 



TAAGCAATACCTCAATAACTTCTTTTGAAGATGACTTTGCAGCTAAAAATAATGGAGTTC 
ATTCGTTATGGAGTTATTGAAGAAAACTTCTACTGAAACGTCGATTTTTATTACCTCAAG 
4090 4100 4110 4120 4130 4140 



TCTCTAAAACATCCCTTGAGTTTATATTAGCCCCGTAACTAATGAGTAGTTCTGTTATAT 
AGAGATTTTGTAGGGAACTCAAATATAATCGGGGCATTGATTACTCATCAAGACAATATA 
4150 4160 4170 4180 4190 4200 



CTTTGGAATCTATTGATATATTATTAATTACAATTGTCATGCTGACATATATAGACATCA 
GAAACCTTAGATAACTATATAATAATTAATGTTAACAGTACGACTGTATATATCTGTAGT 
4210 4220 4230 4240 4250 4260 



TAATATGATGAAAAATATGAAAATATAAGTGCACGTTTACTGTTACTATGATTGTGATAT 
ATTATACTACTTTTTATACTTTTATATTCACGTGCAAATGACAATGATACTAACACTATA 
4270 4280 4290 4300 4310 4320 



CGATATGAGTTCTTTAATAAAAGTACTGAAATAGATATAATGCAGATATGATTGATATTT 
GCTATACTCAAGAAATTATTTTCATGACTTTATCTATATTACGTCTATACTAACTATAAA 
4330 4340 4350 4360 4370 4380 



TAAAAAGTTGAAAAAAAATATGCCCTGTTTACAAATACTATTTGGAAATATTCTGTAATA 
ATTTTTCAACTTTTTTTTATACGGGACAAATGTTTATGATAAACCTTTATAAGACATTAT 
4390 4400 4410 4420 4430 4440 



AAGTAATAGTGATATGTCAGTCACGATGGATTTGCCAATTGATCATATGAGTATAGATAA 
TTCATTATCACTATACAGTCAGTGCTACCTAAACGGTTAACTAGTATACTCATATCTATT 
4450 4460 4470 4480 4490 4500 



CATAAACGAGTATAATAAAAATGGATATACTAGACTCTATATAGAGGTAGCCATGAAAAA 
GTATTTGCTCATATTATTTTTACCTATATGATCTGAGATATATCTCCATCGGTACTTTTT 
4510 4520 4530 4540 4550 4560 



ACGTAAAAACGTAGATAGACTTTTATATCTCGGAGCTGATCCGAATCTGGCTAGTGTAGA
TGCATTTTTGCATCTATCTGAAAATATAGAGCCTCGACTAGGCTTAGACCGATCACATCT 
4570 4580 4590 4600 4610 4620 



TTCGTATTGTCCTCTTCATATTGCTGTTAGGAATGGTAGTTTAAAGATAATAAGATCATT 
AAGCATAACAGGAGAAGTATAACGACAATCCTTACCATCAAATTTCTATTATTCTAGTAA 
4630 4640 4650 4660 4670 4680 






GTTGAAATATGGTGCTAATATAAATCAAGAATGTCATGAAGGAGATACTGCTTTGATGAT 
CAACTTTATACCACGATTATATTTAGTTCTTACAGTACTTCCTCTATGACGAAACTACTA 
4690 4700 4710 4720 4730 4740 



GGCTATATCATTAGGTAATTATACAGCATGTAAAACACTTCTAGATAACAACGCCGATCC 
CCGATATAGTAATCCATTAATATGTCGTACATTTTGTGAAGATCTATTGTTGCGGCTAGG 
4750 4760 4770 4780 4790 4800 



TAATTATGTTAACTATTACGGTATAGTTCCGCTTATTAGAGCAATTATATGTGAAAAGCC 
ATTAATACAATTGATAATGCCATATCAAGGCGAATAATCTCGTTAATATACACTTTTCGG 
4810 4820 4830 4840 4850 4860 



TGACATAGTTAGACTGCTATTAGATAGAGGAGCTAATTGCAACCACTTAATTACAAAAAA 
ACTGTATCAATCTGACGATAATCTATCTCCTCGATTAACGTTGGTGAATTAATGTTTTTT 
4870 4880 4890 4900 4910 4920 



CGGTAGAACCTATACTGCTTTAGAGAGTCTTAGGAATTGCTTTTTTAAAGACAATTCTTC 
GCCATCTTGGATATGACGAAATCTCTCAGAATCCTTAACGAAAAAATTTCTGTTAAGAAG 
4930 4940 4950 4960 4970 4980 



ATCATTGTCGATACTAATAT 
TAGTAACAGCTATGATTATA 
4990 5000 



Figure 14. VQH6 amplified fragment (SEQ ID NO:76) 

VQ marker 



BamH I 



V 



Q G E N E 



AAAGGATCCGGGTTAATTAATTAGTCATCAGGCAGGGCGAGAACGAGACT 



TTTCCTAGGCCCAATTAATTAATCAGTAGTCCGTCCCGCTCTTGCTCTGA 



ICS*** => H6p 

ATCTGCTCGTTAATTAATTAGAGCTTCTTTATTCTATACTTAAAAAGTGA 
TAGACGAGCAATTAATTAATCTCGAAGAAATAAGATATGAATTTTTCACT 



AAATAAATACAAAGGTTCTTGAGGGTTGTGTTAAATTGAAAGCGAGAAAT 
TTTATTTATGTTTCCAAGAACTCCCAACACAATTTAACTTTCACTCTTTA 



EcoR V

aatcataaattatttcattatcgcgatatccgttaagtttgtatcgtagg
ttagtatttaataaagtaatagcgctataggcaattcaaacatagcatcc



Kpn I Xho I Xba I Cla I Sma I
TACC CTCGAG TCTAGA ATCGAT CCCGGG TTTT

ATGG GAGCTC AGATCT TAGCTA GGGCCC AAAA



Figure 15. Restriction map of pNVQH6C5LSP-18 